Recently FINRA fined an investment firm $17 million for inadequately investigating anti-money-laundering “red flags” during the period from 2006 to 2014, when the firm's business grew significantly. That growth was not accompanied by matching growth in the systems needed to assure compliance. The firm relied on patchwork solutions for large volumes of data, which created data silos and made it difficult to combine the data for investigations.
The period from 2006 to 2014 is a very long time in terms of the evolution of technology and changes in regulations and financial instruments, so I can see how this situation could easily happen. If you do not have a data management system that can scale, consolidate silos of data, leverage technology advances, and respond to regulatory and business changes, compliance can easily get out of hand.
Ten years ago, data management for compliance depended mainly on CAS (Content Addressable Storage), which was successfully marketed by EMC as Centera. CAS is based on hashing the object (file) and using that hash as the address into the CAS repository. The hash could also be checked on retrieval to show immutability, which was a plus for compliance. Another plus was that it had a flat structure and could grow to large capacities of low-cost storage. Access to Centera required an API, which made it proprietary, but that did not deter users who saw it as a solution for retention of compliance data. Many ISVs were happy to jump in and provide application-specific solutions based on the Centera API, since it provided them with a proprietary lock-in.
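To make the CAS idea concrete, here is a minimal sketch of content addressing in Python. It is only an illustration of the general technique, not Centera's API or implementation: the class name, the in-memory dictionary, and the choice of SHA-256 as the hash are all assumptions made for the example.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS (hypothetical): an object's hash is its address."""

    def __init__(self):
        # Flat namespace: hex digest -> object bytes. No directories.
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself (SHA-256 assumed).
        address = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same address.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on retrieval: a mismatch means the object was altered.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object no longer matches its address")
        return data


store = ContentAddressableStore()
address = store.put(b"trade confirmation 2006-03-14")
assert store.get(address) == b"trade confirmation 2006-03-14"
```

Note that the caller has to keep the returned address somewhere, because there is no file name or path to look the object up by. That is why applications had to be written against the store's API.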
Hitachi Data Systems' offering at that time was a product called HCAP (Hitachi Content Archive Platform), which was developed in partnership with Archivas. Hitachi and Archivas took the approach of indexing the metadata and content as the files were written to the archive, so that HCAP was aware of the content and could provide event-based updating of the full-text and metadata index as retention status changed or as files were deleted. Hashing of the object was also provided, but for immutability, not for addressing. While proprietary CAS solutions focused on storing content, HCAP was focused on accessing it. The interfaces to HCAP were non-proprietary, supporting NFS, CIFS, SMTP, WebDAV, and HTTP. HCAP also supported policy-based integration from many distributed or centralized repositories such as e-mail, file systems, databases, applications, and content or document management systems. The elimination of silos enabled users to leverage a set of common and unified archive services such as centralized search, policy-based retention, authentication, and protection.
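The contrast with CAS can be sketched the same way. In the toy Python example below, objects are addressed by an ordinary path, a search index is updated as objects are written and deleted, and the hash is kept only to verify integrity. This is a simplified illustration under my own assumptions, not HCAP's actual design: the class, its method names, the whitespace tokenizer, and SHA-256 are all hypothetical, and retention handling is left out.

```python
import hashlib
from collections import defaultdict


class IndexedArchive:
    """Toy archive (hypothetical): path-addressed, indexed on write,
    with a stored hash used only for integrity checks."""

    def __init__(self):
        self._objects = {}              # path -> (data, metadata, digest)
        self._index = defaultdict(set)  # term -> set of paths

    @staticmethod
    def _terms(text, metadata):
        # Naive tokenizer: content words plus metadata values.
        words = text.lower().split()
        words += [str(value).lower() for value in metadata.values()]
        return set(words)

    def write(self, path, text, metadata):
        if path in self._objects:
            self.delete(path)  # drop stale index entries first
        data = text.encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        self._objects[path] = (data, dict(metadata), digest)
        # Index content and metadata at write time.
        for term in self._terms(text, metadata):
            self._index[term].add(path)

    def delete(self, path):
        data, metadata, _ = self._objects.pop(path)
        # Event-based update: the index changes when the object does.
        for term in self._terms(data.decode("utf-8"), metadata):
            self._index[term].discard(path)

    def search(self, term):
        return sorted(self._index.get(term.lower(), set()))

    def verify(self, path):
        # The hash proves the content is unchanged; it is not the address.
        data, _, digest = self._objects[path]
        return hashlib.sha256(data).hexdigest() == digest


archive = IndexedArchive()
archive.write("/mail/2006/msg-001.eml",
              "wire transfer flagged for review", {"owner": "compliance"})
assert archive.search("flagged") == ["/mail/2006/msg-001.eml"]
assert archive.verify("/mail/2006/msg-001.eml")
archive.delete("/mail/2006/msg-001.eml")
assert archive.search("flagged") == []
```

Because the index lives with the archive rather than in each application, one search can span content that arrived from many different sources.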
In 2007 Hitachi Data Systems acquired Archivas and shortly thereafter changed the product name to Hitachi Content Platform (HCP), since the product offered more capabilities than just archiving. HCP is a secure, distributed, object-based storage system that was designed to deliver smart web-scale solutions. HCP obviated the need for a siloed approach to storing an assortment of unstructured content. With massive scale, multiple storage tiers, multi-tenancy, and configurable attributes for each tenant, HCP could support a range of applications on a single HCP instance and combine the data for in-depth investigations. As technology evolved, HCP added support for additional interfaces like Amazon S3 and OpenStack Swift, the latest advances in server and storage technology like VMware and erasure coding, and numerous enhancements in security, clustering, and data management.
HCP is designed for the long term, with open interfaces and non-disruptive hardware and software upgrades that take advantage of the latest technology solutions and business trends. A customer who purchased HCAP in 2006 could have non-disruptively upgraded through multiple generations of hardware and seven versions of HCP. More importantly, they could adopt new technologies and information management practices as new initiatives like cloud, big data, mobile, and social evolved. While HCP remains up to date and positioned for future growth, many analysts such as Gartner are claiming that Centera is obsolete and are recommending compliance archiving alternatives to Centera after the Dell acquisition of EMC. With an estimated 600 PB of data on Centera, migration will be a major problem.
HCP is at the core of Hitachi's object storage strategy, and Hitachi Data Systems is unique in the way that it has expanded its object storage portfolio around HCP.
- Hitachi Data Ingestor (HDI) is a cloud storage gateway that enables remote and branch offices to be up and running in minutes with a low-cost, easy-to-implement file-serving solution for both enterprises and service providers.
- Hitachi Content Platform Anywhere (HCP Anywhere) is a secure enterprise file sync-and-share solution that enables a more productive workforce with secure access and sharing across mobile devices, tablets, and browsers.
- Hitachi Content Intelligence (HCI) connects and indexes data that resides on HCP, HCP Anywhere, HDI, and cloud repositories to automate the extraction, classification, and categorization of data.
Initially, in 2006, HCP used DAS for small and intermediate configurations and SAN-attached storage for large enterprise configurations. Today, HCP configurations include low-cost, erasure-coded HCP S10 and S30 network-attached storage nodes as well as public cloud, enabling hundreds of PB of object storage all under one control without the need for a SAN. HCP server nodes have been consolidated to one model, the HCP G10, and HCP can also run on a VM. By separating logical services from physical infrastructure, HCP allows both to scale independently while continuing to utilize existing assets.
HCP’s track record has proven that it can support your long-term and changing requirements for archive, compliance, and analytics. You can be sure that there will be a version 8 of HCP as it evolves to leverage new technologies and information management practices. You can also be sure that version 8 and the integrated portfolio of HDI, HCP Anywhere, and HCI will continue with non-disruptive upgrades.