Hu Yoshida

Email Investigations Made Easier with HCP/HCI

Blog Post created by Hu Yoshida on Feb 26, 2017



Email is the primary communications tool for enterprise business. We exchange documents and make agreements through email, and emails contain private information about people and companies. Almost everything we do is documented in email and leaves an electronic trail, so email has become a dominant area of focus in corporate litigation. Email investigations for compliance provide the evidence to prove or disprove a violation of regulations, whether the accusation is fraud or misuse of personal or corporate information. They can establish not only what occurred but also who was involved. Ongoing audits and monitoring of email can help ensure compliance and protection of information.


Due to the high volume of email, investigations can be cost-prohibitive and time-consuming. Businesses must bear the cost of the discovery process and must make every reasonable effort to retrieve relevant documents. One firm had to recover messages spanning 3½ years from tapes across 40 email servers, at a substantial cost in time and effort. The cost of eDiscovery can run into the tens of thousands of dollars, especially if you have to load and scan generations of backup tapes or years of archive storage. The high-profile investigation of Hillary Clinton’s private email server over a 7-year period is estimated to have cost $20 million, according to the American Thinker.


A corporate policy that email is deleted after “x” number of days does not protect the company from having to run through a costly discovery process. The company can claim that it no longer has an email because it was deleted from the server, but that does not mean an end user does not have a copy of it in a local PST file on a laptop. The current view for meeting regulatory compliance is to retain all email in an indexed format for ease of search and retrieval. Fines for non-compliance with defined regulations are also increasing. A recent example is the General Data Protection Regulation, which will allow fines of up to 20 million euros or 4% of an organization’s worldwide annual turnover, whichever is higher.
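
To make the "whichever is higher" rule concrete, here is a minimal sketch of the arithmetic. The function name and the use of whole euros are assumptions made for illustration; this shows only the upper bound of the higher fine tier, not how a regulator would actually set a fine.

```python
# Illustrative only: upper bound of a GDPR fine in the higher tier.
# The function name and whole-euro integer amounts are assumptions for this sketch.

FLAT_CAP_EUR = 20_000_000      # 20 million euros
TURNOVER_PERCENT = 4           # 4% of worldwide annual turnover


def gdpr_max_fine(worldwide_turnover_eur: int) -> int:
    """Return the greater of the flat cap or 4% of turnover, in whole euros."""
    turnover_based = worldwide_turnover_eur * TURNOVER_PERCENT // 100
    return max(FLAT_CAP_EUR, turnover_based)
```

For a company with 100 million euros in turnover, 4% is only 4 million euros, so the flat 20 million euro cap applies. For a company with 1 billion euros in turnover, the 4% figure (40 million euros) is the one that counts.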


According to David Karas, our Chief Ethics and Compliance Officer for HDS, our cost for email investigation is much lower due to the use of our Hitachi Content Platform (HCP) and our Hitachi Data Discovery Suite (HDDS). HDS does not use HCP to archive email. HCP is used to ingest, tag and store the email journal. Exchange has a journaling feature that forwards a copy of each and every email to the journal (HCP, via SMTP), where it can be stored forever. That’s different from an archive, which is more about saving space by keeping newer emails on more active storage and stubbing or tiering older emails to less costly storage. Archived emails can still be deleted. A journal is really meant to keep all email, and to do so as a separate copy so you have a complete record. This eliminates the need to scan generations of backup tapes from multiple servers or to scan PST files. Since we use HCP Anywhere, we also include emails and attachments on mobile devices.


Using HDDS, David can structure a search and have it execute in less than a second, then refine it and run multiple searches in a matter of minutes. Often he can decide after a few runs whether the claim he is investigating has any merit. If it does not, he can close the claim and save the cost of further investigation. David is not a data scientist. He is a lawyer with deep experience and expertise in ethics and compliance behaviors. He can open up his computer anywhere in the world and do his own investigation. He does not need to send a request to IT to search for emails between Hu Yoshida and Britney Spears, which could take several days and run the risk of information leakage that might be embarrassing to Hu and Britney. (This is a hypothetical example.)
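
To picture why a journal kept in an indexed format makes a "who emailed whom" search fast, here is a toy in-memory sketch. It is not the HDDS or HCP interface; the class name, method names and example addresses are all hypothetical, and a real system would index far more than participant addresses.

```python
# Toy illustration of an indexed email journal. This is NOT the HDDS/HCP API;
# JournalIndex, ingest() and between() are hypothetical names for this sketch.
from collections import defaultdict
from email import message_from_string
from email.utils import getaddresses


class JournalIndex:
    def __init__(self):
        self._messages = []                   # every journaled message is kept
        self._by_address = defaultdict(set)   # address -> ids of messages involving it

    def ingest(self, raw: str) -> int:
        """Store one raw RFC 822 message and index its participants."""
        msg = message_from_string(raw)
        msg_id = len(self._messages)
        self._messages.append(msg)
        headers = (msg.get_all("From", []) + msg.get_all("To", [])
                   + msg.get_all("Cc", []))
        for _name, addr in getaddresses(headers):
            if addr:
                self._by_address[addr.lower()].add(msg_id)
        return msg_id

    def between(self, a: str, b: str) -> list:
        """Return subjects of messages that involve both addresses."""
        ids = (self._by_address.get(a.lower(), set())
               & self._by_address.get(b.lower(), set()))
        return [self._messages[i]["Subject"] for i in sorted(ids)]


journal = JournalIndex()
journal.ingest("From: alice@example.com\nTo: bob@example.com\nSubject: Contract\n\nDraft attached.")
journal.ingest("From: carol@example.com\nTo: bob@example.com\nSubject: Lunch\n\nNoon?")
matches = journal.between("alice@example.com", "bob@example.com")
```

Because the index is built at ingest time, the search is a set intersection rather than a scan of every stored message, which is the general idea behind refining and rerunning searches in minutes instead of days.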


HDDS was our first-generation search tool, and it is being replaced by our Hitachi Content Intelligence (HCI) product. HCI will add intelligence to our ability to investigate HCP repositories, such as the ones where the Exchange journal is ingested. At HDS we are currently in the process of converting to Office 365 and replacing HDDS with HCI.


Office 365 is cloud-based, but it still provides a journal that can be ingested by HCP, where we can search it just as we have in the past but with a newer search tool, HCI. In fact, this is preferred by Office 365. The journal could be sent back into Office 365 to an internal journaling mailbox, but this would chew up a lot of cloud storage capacity and increase the cost of Office 365. The Office 365 archive is used for reducing cloud storage capacity, but searches take 45 minutes or more since the archive is not optimized for search the way HCP is. Search in Office 365 is optimized to the lowest common denominator and therefore takes much longer.


A data scientist, a storage specialist, and a lawyer walk into a bar….


You’ve heard this one before. This is the setup for a joke where three individuals view a situation from different perspectives. The punch line comes from the view of the last individual. The data scientist will interpret and represent data mathematically; the storage specialist will focus on storing, protecting and retrieving data; and the lawyer is looking for evidence to prove or disprove that something has occurred. HCP and HCI will do the work of the data scientist and the storage specialist so that the lawyer can concentrate on what he does best.


Check with your compliance or eDiscovery officer and see what it costs them to investigate email or other data repositories. Then talk to your HDS representative about the capabilities of HCP and HCI to discover how we can facilitate the process of eDiscovery. Not all of us are lawyers who are experienced in compliance. However, David said that he could be available to talk to your compliance officer to share best practices and experiences.