Hu Yoshida

Controlling your explosion of copies may be your biggest opportunity to reduce costs

Blog Post created by Hu Yoshida Employee on Nov 7, 2014

Feb 6, 2014


Laura DuBois of IDC has been calling attention to the problem of exploding data copies since 2012, and I have blogged about it several times, including in my top ten trends for 2013 and 2014. Her latest report, published January 31 of this year, reconfirms that 65% of external storage system capacity is used to store non-primary data such as snapshots, clones, replicas, archives, and backup data.

In this report she describes the need for a copy data management solution that provides storage virtualization, storage efficiency, and application-aware data management services, eliminating the need to procure separate infrastructure for each of these use cases. She advises that an investment in a copy data management solution can significantly reduce OPEX and CAPEX across software, compute, and storage infrastructure. One of the vendors she mentions is Hitachi Data Systems.

Copies are a necessary part of doing business. However, we can do a lot to eliminate unnecessary copies, reduce the capacity of the copies, and manage the life cycle of copies to eliminate the waste of orphan copies and the risk of rogue copies. HDS has a portfolio of solutions that reduce the number of copies applications require and manage their life cycle to reduce costs.

Storage Virtualization

Storage virtualization reduces copies in several ways. First, it eliminates application silos and enables applications to access a common pool of storage resources. You do not have to extract a complete volume copy from one storage silo and export it to another storage silo for use by another application such as development/test or backup. You can also create a virtual copy, where updates are made to a common, virtual pool of pages. All you have to do is point the application to this virtual volume, which eliminates the need to create a clone or point-in-time snapshot. Hitachi storage systems can create 1024 virtual copies, saving physical capacity costs as well as the operational costs associated with the creation and movement of copies.
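To make the idea of a virtual copy concrete, here is a minimal sketch of two volumes sharing one pool of pages. It is only an illustration of the general technique (new writes are redirected to fresh pages while unchanged pages stay shared) and is not how Hitachi storage systems implement it. The class names PagePool and VirtualVolume, and the block contents, are hypothetical.

```python
class PagePool:
    """Hypothetical shared pool of physical pages (illustration only)."""

    def __init__(self):
        self._pages = {}      # page id -> page contents
        self._next_id = 0

    def allocate(self, data: bytes) -> int:
        """Store data in a new physical page and return its id."""
        page_id = self._next_id
        self._next_id += 1
        self._pages[page_id] = data
        return page_id

    def read(self, page_id: int) -> bytes:
        return self._pages[page_id]

    def physical_pages(self) -> int:
        return len(self._pages)


class VirtualVolume:
    """Hypothetical volume that is just a map from logical blocks to pool pages."""

    def __init__(self, pool: PagePool, page_map=None):
        self.pool = pool
        self.page_map = dict(page_map or {})   # logical block -> page id

    def write(self, block: int, data: bytes) -> None:
        # Redirect the write to a new page; pages referenced by other
        # volumes are never overwritten.
        self.page_map[block] = self.pool.allocate(data)

    def read(self, block: int) -> bytes:
        return self.pool.read(self.page_map[block])

    def virtual_copy(self) -> "VirtualVolume":
        # Copies only the block map, not the pages themselves.
        return VirtualVolume(self.pool, self.page_map)


pool = PagePool()
production = VirtualVolume(pool)
for i in range(4):
    production.write(i, f"block-{i}".encode())

dev_test = production.virtual_copy()   # no pages are duplicated here
dev_test.write(2, b"test-data")        # only the changed block uses a new page

print(pool.physical_pages())           # 5 pages, versus 8 for a full clone
```

In this sketch the development/test volume sees its own change to block 2 while production still reads the original data, and the pool grows only by the pages that actually differ.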

Other storage virtualization features include automated tiering, which reduces the cost of keeping data on tier 1 storage when its activity or value does not require tier 1 performance or recoverability, and thin provisioning, which eliminates copies of unused capacity.

Storage Efficiency

The most useful tool for storage efficiency is content-based archiving with the Hitachi Content Platform (HCP). Since the vast majority of data is rarely accessed after its initial creation and processing, it makes sense to archive it once and replicate it once for data protection, instead of making backup copies of the same unchanging data over and over again. HCP is a "content aware" repository, which means that you can query based on the content rather than having to do a tedious and expensive directory crawl. HCP also reduces capacity requirements through single instance store and compression. Data is also hashed for immutability (proof that the data has not changed after it was ingested) and encrypted for privacy. Since backup or data protection is one of the biggest contributors to the proliferation of copies, a content archive like HCP can greatly reduce TCO.
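The combination of single instance store, compression, and hashing can be sketched in a few lines. This is a generic content-addressed store, not HCP's actual design or API: the class SingleInstanceStore, the object names, and the choice of SHA-256 and zlib are all assumptions made for illustration, and encryption is left out to keep the sketch short.

```python
import hashlib
import zlib


class SingleInstanceStore:
    """Hypothetical content-addressed archive (illustration only)."""

    def __init__(self):
        self._objects = {}   # content digest -> compressed bytes
        self._names = {}     # object name -> content digest

    def ingest(self, name: str, content: bytes) -> str:
        """Store content once per unique digest and return that digest."""
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self._objects:
            # First time this content is seen: keep one compressed instance.
            self._objects[digest] = zlib.compress(content)
        self._names[name] = digest
        return digest

    def read(self, name: str) -> bytes:
        """Return the content, verifying it is unchanged since ingest."""
        digest = self._names[name]
        content = zlib.decompress(self._objects[digest])
        if hashlib.sha256(content).hexdigest() != digest:
            raise ValueError(f"{name}: content changed after ingest")
        return content

    def stored_objects(self) -> int:
        return len(self._objects)


store = SingleInstanceStore()
report = b"quarterly report " * 100

# The same attachment archived from two mailboxes is stored only once.
store.ingest("mail/alice/report.doc", report)
store.ingest("mail/bob/report.doc", report)

print(store.stored_objects())   # 1
```

Because the digest is computed at ingest and rechecked on every read, the same value that enables deduplication also serves as evidence that the archived data has not been altered.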

Another way to eliminate copies and make our storage more efficient is to use links when we are sharing data rather than attachments. At HDS we use HCP Anywhere to share information via links to data that is stored in HCP. HCP Anywhere provides a secure way to share data with anyone, anywhere, on any device.

Copy Data Management

Hitachi Data Instance Manager (HDIM) is our primary tool for copy data management. HDIM reduces copy data, or data instances, with a simple, easy-to-use policy management and workflow solution. It can eliminate backup windows and accelerate recovery with tools such as continuous data protection, archive, and replication, and it can meet stringent SLA requirements such as RPOs and RTOs.

Other related copy data management solutions include:

Application Protector

Replication Manager

Backup Services Manager

Data Protection Suite

Dynamic Replicator


The need for copies will not go away, but we can do a lot to eliminate unnecessary copies, reduce the capacity of copies, and reduce the operational cost and risk of copies with virtualization, efficiency, and management tools that we have today.

Laura DuBois closes her recent study by saying: “While a firm won’t eliminate redundant hardware and software spending all at once – over time investments in dedicated silos of infrastructure for backup, archive, business continuity, test/dev, can be eliminated.  The economics of this should not be overlooked.”

David Merrill has posted many times on how to quantify the cost of copies and on the economics of many of the solutions I mention here. Controlling the cost of data copies may be your best opportunity to reduce costs. Here is a short list of some of David's posts from last year:

We Cannot Afford to Protect, Certify, and Encrypt All of the Data That Current Traditions Expect

Defining Costs for Storage Tiers

Storage Virtualization to Reduce Cost of Copies

Are We Over-protected/Over-insured?

Calculating the Total Cost of your Data Protection

A Short Primer on Annualized Loss Expectancy (ALE)

Options to Reduce the Total Cost of Data Protection