If you are going to commit your unstructured data to long-term storage, you want to ensure it will remain available as your data scales to petabytes across generations of technology. That calls for an object storage system that can scale beyond the limitations of hierarchical file systems, does not require backup, and provides rich metadata features that enable search, analytics, and transparency for compliance. Because object storage is a long-term commitment, the vendor you choose and the architecture you implement must be there for the long haul. Availability must be a key requirement.
A Safe Approach to Object Storage - HCP
Many object storage vendors support the S3 API and store objects in the Amazon cloud. Unfortunately, on Feb. 28 the Amazon S3 service was down from about 9:37 am PST until 1:54 pm PST. This disrupted a major part of the internet, including IoT devices like connected light bulbs and thermostats. If your object storage depended on the Amazon S3 cloud, you were affected. Cloud services from major vendors like Amazon and Azure generally provide better availability than most organizations can achieve with their own resources, but the risk of a major outage remains. Organizations therefore need recovery options, such as another copy of the data in a private cloud. Don't trust everything to third-party vendors: bad things can happen whether you use public or private clouds, and you still own the responsibility no matter where your data resides. HCP enables copies of data to be stored in geographically separate locations or in private and public clouds.
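The value of that second, independent copy can be shown with simple probability: if two copies fail independently, at least one is reachable with probability 1 - (1 - a1)(1 - a2). A minimal sketch (the availability figures here are hypothetical illustrations, not vendor SLAs):

```python
# Combined availability of two independent data copies.
# The figures below are hypothetical examples, not vendor SLAs.

def combined_availability(a1: float, a2: float) -> float:
    """Probability that at least one of two independent copies is reachable."""
    return 1 - (1 - a1) * (1 - a2)

public_cloud = 0.999   # hypothetical: roughly 8.8 hours of downtime per year
private_cloud = 0.999  # hypothetical second, independent copy

both = combined_availability(public_cloud, private_cloud)
print(f"single copy: {public_cloud:.3%}, two independent copies: {both:.5%}")
```

Three nines on each copy alone yields roughly six nines combined, which is why a geographically separate or private-cloud copy matters even when each individual cloud is already quite reliable.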
A SAIN Approach to Object Storage - HCP
I was reading a blog post from another object storage vendor who was making a point about how changes in traditional object storage topology trigger a rebalance mechanism to reflect the new address scheme and satisfy the protection policy. With traditional object storage, storage is included in the server, so when you need to increase storage or compute, you must add both, whether or not you need the other. Once a node is added to the system, data may need to be redistributed across the nodes for redundancy and performance. The problem is how to reduce the impact of this rebalancing and make it transparent to the SLA of the platform. The choices seem to be: do not rebalance at all, or rebalance immediately and take the hit to the SLA.
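The scale of that rebalance can be sketched with a toy placement model (a generic illustration of the problem, not HCP's or any vendor's actual algorithm): with naive modulo placement, adding one node remaps most keys, while consistent hashing moves only roughly 1/(n+1) of them to the new node.

```python
import hashlib
from bisect import bisect

# Toy illustration of rebalance cost when one node is added to a cluster.
# This is a generic sketch, not HCP's or any vendor's actual placement scheme.

def h(s: str) -> int:
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

def modulo_owner(key: str, n: int) -> int:
    return h(key) % n

def ring(nodes, vnodes=64):
    # Consistent-hash ring: each node contributes `vnodes` points.
    return sorted((h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))

def ring_owner(key: str, points) -> str:
    i = bisect(points, (h(key), "")) % len(points)
    return points[i][1]

keys = [f"object-{i}" for i in range(10_000)]

# Modulo placement: going from 4 to 5 nodes remaps almost all keys.
moved_mod = sum(modulo_owner(k, 4) != modulo_owner(k, 5) for k in keys)

# Consistent hashing: only about 1/5 of keys move, all onto the new node.
old = ring([f"node{i}" for i in range(4)])
new = ring([f"node{i}" for i in range(5)])
moved_ring = sum(ring_owner(k, old) != ring_owner(k, new) for k in keys)

print(f"modulo: {moved_mod/len(keys):.0%} of keys moved, "
      f"ring: {moved_ring/len(keys):.0%} of keys moved")
```

Even with a placement scheme that minimizes movement, some fraction of the data must still migrate, which is why controlling when and how intensely that migration runs matters for the SLA.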
With HCP we have several options. The first option is SAIN, which stands for SAN Attached Independent Nodes: HCP G or HCP VM Access Nodes are attached via a SAN to an external storage system such as the Hitachi VSP. SAIN allows the storage and compute resources to be scaled separately, and data protection is provided by the VSP RAID controller.
Scale Transparently from Small to Large
In addition to adding SAIN storage, additional compute and storage can be added in multiple layers with little or no impact on SLA. HCP Access Nodes can be configured over an IP network as a Redundant Array of Independent Nodes (RAIN) with no SAN-attached storage. This is similar to traditional object storage, where storage is directly attached to the servers and data is shared across an internal IP network; data protection is provided by copying data across the nodes. HCP Access Nodes also support S3 access to HCP S storage nodes and to cloud storage over a storage IP network. Our current HCP Access Nodes come as an HCP G appliance or as an HCP VM, which is HCP Access Node software that runs on a virtual machine.
An initial configuration can start with 3 Access Nodes using RAIN.
HCP can scale to a very large configuration including RAIN, SAIN, HCP S, and public cloud.
The S10 and S30 are HCP S storage nodes with low-cost, erasure-coded JBOD disks that are comparable in cost to public cloud storage but sit in a private cloud and are accessed by the HCP Access Nodes over a storage IP network. (The S10 supports up to 192 TB in 4RU; the S30 supports up to 9.4 PB in 68RU; each uses 10 TB JBOD disks.) Public cloud is also accessed over this storage network.
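Erasure coding is what keeps the cost of those JBOD disks close to public cloud pricing. As a rough sketch of the arithmetic for a generic k+m scheme (the 12+4 geometry below is an illustrative assumption, not the S node's published layout): data is split into k fragments plus m parity fragments, so the system survives m lost disks at an overhead of (k+m)/k rather than the 2x or 3x of full replication.

```python
# Storage overhead of a generic k+m erasure-coding scheme versus replication.
# The 12+4 geometry is an illustrative assumption, not HCP S's published layout.

def ec_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data (k data + m parity fragments)."""
    return (k + m) / k

def usable_tb(raw_tb: float, k: int, m: int) -> float:
    """User-visible capacity obtainable from a given raw capacity."""
    return raw_tb / ec_overhead(k, m)

print(f"12+4 erasure coding: {ec_overhead(12, 4):.2f}x raw per user byte, "
      f"tolerates 4 lost fragments")
print("3-way replication:   3.00x raw per user byte, tolerates 2 lost copies")
print(f"100 TB raw -> {usable_tb(100, 12, 4):.1f} TB usable with 12+4")
```

At roughly 1.33x overhead instead of 3x, a k+m scheme more than doubles the usable capacity from the same raw disks while tolerating multiple concurrent disk failures, which is the basic economics behind low-cost erasure-coded JBOD.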
(1) – Adding HCP G or HCP VM Access Nodes. The newly added servers are immediately available and usable to improve performance, and the user has complete control over the rebalance. A user can configure the intensity and schedule of the rebalancing service; for example, they can schedule the rebalance to run automatically at high intensity only during off-hours, to avoid impacting on-hours SLAs. This control gives you the best of both worlds: an immediate performance increase as well as a tunable rebalance.
(2) – Adding storage capacity to an HCP G or HCP VM Access Node is seamless. Since an HCP system provides a unified namespace, capacity is added in the background in the form of LUNs or volumes, but this is invisible to the end user or application. This is the same whether the capacity is internal or SAN-attached. The balancing of data between old and new storage volumes is also tunable, the same as in (1) above.
(3) – HCP S nodes can be transparently added to an HCP system, effectively increasing the total storage capacity of the system without any disruption. S nodes added to an HCP system are immediately available for use, and no rebalance of data between S nodes is necessary. Adding S nodes to an HCP storage pool therefore simply increases the available bandwidth and performance for incoming data.
(4) – HCP S nodes can have their capacity expanded. An S node distributes data across all available JBOD disks using erasure coding. When disks are added to or removed from an S node, internal services automatically and transparently optimize data placement for the greatest availability, protection, and performance.
(5) – HCP capacity can be expanded to the public cloud over S3. Cloud capacity is added transparently through an S3 connection to the public cloud.
HCP can support your unstructured data needs in public or private clouds with the assurance of safety, availability, and scalability. Using HCP's adaptive cloud tiering (ACT) functionality, you can manage a single storage pool using any combination of server disks, Ethernet-attached HCP S nodes, SAN disks, NFS, or one or more public cloud services, including Amazon S3, Google Cloud Storage, Microsoft Azure®, Verizon Cloud, Hitachi Cloud Service for Content Archiving, or any other S3-enabled cloud service.
For more detailed information, please see the HCP architecture white paper at this link.