
Hu's Place


NEXT 2018.png


At Hitachi Vantara’s NEXT 2018 event, this week in San Diego, we were fortunate to have Malcolm Gladwell as a keynote speaker. Malcolm Gladwell is a writer for The New Yorker and has published five books that have been on the New York Times bestseller list: The Tipping Point, Blink, Outliers, What the Dog Saw, and David and Goliath. Malcolm’s books and articles often provide unexpected insights that help us understand the events around us and enable us to make better decisions.


In the general session our CEO Brian Householder interviewed Malcolm. Several ideas came out of the conversation. One was that a model is only as good as the data that goes into it, which he illustrated with the use of standardized student testing to measure the quality of teachers. Another was to focus on the core issue and not to panic when circumstances change. The example here was the music industry, where the business shifted from selling recordings to streaming music. The music industry panicked over the loss of revenue, but today it is making more money than ever from live performances, which are promoted by their streamed music. I can see a similar transition in our industry: cloud was once a threat to IT vendors, but today IT vendor revenues are increasing thanks to software and services that make it easier for their customers to develop applications and generate information.


Later I had the opportunity to moderate a Q&A session with Malcolm and a group of VIP customers. Here we started with the paradigm of puzzles versus mysteries. While Malcolm is known for his bestsellers, many of the ideas first raised in his New Yorker articles are generating even more interest today. In 2007 he wrote an article, “Open Secrets,” in which he introduced the paradigm of puzzles versus mysteries.


For example, the whereabouts of Osama bin Laden was a puzzle. It was eventually solved and dealt with. Puzzles can be solved by gathering more data; there is an answer. On the other hand, what would happen to Iraq after the fall of Saddam Hussein was a mystery. Mysteries require judgment and the assessment of uncertainty, and the hard part is not that we have too little information but that we have too much. A mystery is a problem caused by an excess of information, and it can only be solved by making sense, in a more sophisticated way, of what is already known. Today we live in an age of information overload, and all professions must manage the transition from puzzle solving to mystery solving.


While the direction of the Q&A did not allow him to go deeper into this topic, he has talked about it in other interviews. In one interview he said that most businesses and industries are built for solving puzzles, not mysteries. With greater access to data comes greater responsibility. Mysteries require a shift in thinking that most industries are simply not organized or prepared to handle. Complex mysteries require a different kind of thinking and analytic skills. You need to decide whether the problem you are solving is a puzzle or a mystery. If you think you are solving a puzzle and collect more and more data, the overload of data might bury the key nuggets that could help you solve the mystery. Data that is used to solve a puzzle must be looked at differently when you are solving a mystery. One of the biggest mistakes that businesses make is treating all data equally, giving all data equal weight.


“This idea gets back to Nate Silver's The Signal and the Noise concept. Some data will tell you precisely what you need to know and help you solve the mystery you're after, but most data is noise, distracting you from the answers you seek. Today, the trick is not analyzing the data so much as understanding which data represents the signal you pay attention to versus the data that is the noise that you must ignore. We have to start ranking data in terms of its value.”


The transition from puzzles to mysteries should resonate with most CIOs with regard to their data. Databases were great at solving puzzles, but the complex nature of today’s business is more of a mystery that requires big data and analytics. In order to be competitive, companies are shifting from data-generating to data-powered organizations, and big data systems are becoming the center of gravity in terms of storage, access, and operations. Data curation is needed to understand the meaning of the data, as well as the technologies that are applied to the data, so that data engineers can move and transform the essential data that data consumers need. Hitachi Content Intelligence and Pentaho Data Integration are key tools for searching, classifying, curating, enriching, and analyzing the data to understand what you have and how it can be used to solve mysteries.

Don’t Blink or you’ll miss What the Dog Saw. Come to NEXT 2018, the Premier Event for the Digital Revolution, sponsored by Hitachi Vantara in beautiful San Diego, Sept. 25-27. Catch the Outliers and what may be the Tipping Point for your successful transformation with your data, your innovation.




By now you may have recognized the titles of a few of the bestselling books of Malcolm Gladwell, one of Time magazine’s 100 most influential people and one of Foreign Policy’s Top Global Thinkers. Gladwell is a storyteller who makes you see things in different ways. He is an assumption challenger. He has explored how ideas spread in The Tipping Point, decision making in Blink, and the roots of success in Outliers; in What the Dog Saw, he has compiled several interesting psychological and sociological topics, ranging from people who are very good at what they do but are not necessarily well known, to intelligence failures and the fall of Enron.


Malcolm Gladwell also hosts Revisionist History, a podcast produced through Panoply Media. It began in 2016 and has aired three 10-episode seasons. Revisionist History is Malcolm Gladwell's journey through the overlooked and the misunderstood. Every episode re-examines something from the past—an event, a person, an idea, even a song—and asks whether we got it right the first time. Being a veteran of the Vietnam War, I was particularly interested in the episode “Saigon, 1965.” If you are interested in history, you will enjoy this podcast.


Malcolm Gladwell will be addressing the topic What Disruptors Have in Common on Wednesday morning, Sept 26, at NEXT 2018. If you can’t make it to NEXT 2018, you can watch the livestream of Malcolm Gladwell’s keynote online.


Don't miss this event and a chance to hear Malcolm Gladwell's insights into the changes facing our industry!

Big data refers to the use of data sets that are so big and complex that traditional data processing infrastructure and application software are challenged to deal with them. Big data is associated with the coming of the digital age, where unstructured data begins to outpace the growth of structured data. Initially the characteristics of big data were associated with volume, variety, and velocity. Later the concepts of veracity and value were added as enterprises sought to find the ROI in capturing and storing big data. Big data systems are becoming the center of gravity in terms of storage, access, and operations, and businesses will look to build a global data fabric that will give comprehensive access to data from many sources and computation for truly multi-tenant systems. The following chart is a composite of IDC’s 2018 Worldwide Enterprise Storage Systems Market Forecast, 2018-2022, and the volume of big data in data center storage worldwide from 2015 to 2021. This composite shows that big data will eventually be more than half of the capacity of enterprise data by 2021.


Big Data Growth.png


Unfortunately, deployments of large data hubs over the last 25 years (e.g., data warehouses, master data management, data lakes, ERP, Salesforce and Hadoop) have resulted in more data silos that are not easily understood, related, or shared. In the May-June issue of the Harvard Business Review, Leandro DalleMule and Thomas Davenport published an article, “What’s Your Data Strategy?”, in which they claimed that less than 50% of an organization’s structured data is actively used in making decisions and less than 1% of its unstructured data is analyzed or used at all. While the ability to manage and gain value from the increasing flood of data is more critical than ever, most organizations are falling behind. I contend that much of this is due not to a lack of interest or need, but to the difficulty of accessing the silos of data. I believe we need to add another “V” word in association with big data. That word is virtualization.


In IT there have been two opposite approaches to virtualization: server virtualization, where you make one physical server look like multiple servers, and storage virtualization, where you make multiple physical storage units look like a single storage system, a pool of available storage capacity that can be managed and enhanced by a central control unit. The virtualization of big data is more like storage virtualization, where multiple data silos can be managed and accessed as one pool of data. The virtualization of data is done through the use of metadata, which enables diverse data to be stored and managed as objects.
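To make the metadata idea concrete, here is a toy sketch in plain Python (not any product API; the silo names, keys, and metadata fields are invented for illustration) of how objects carrying their own metadata can be searched as one pool, regardless of which physical silo holds the data:

```python
from dataclasses import dataclass, field

@dataclass
class StoredObject:
    key: str
    silo: str                       # which physical system holds the data
    metadata: dict = field(default_factory=dict)

# A unified catalog spanning three separate physical silos.
catalog = [
    StoredObject("scan-001.dcm", silo="pacs-nas",
                 metadata={"type": "dicom", "dept": "radiology"}),
    StoredObject("claim-984.pdf", silo="erp-store",
                 metadata={"type": "pdf", "dept": "claims"}),
    StoredObject("scan-002.dcm", silo="cloud-s3",
                 metadata={"type": "dicom", "dept": "radiology"}),
]

def search(catalog, **criteria):
    """Federated metadata search: match objects no matter which silo they live in."""
    return [o for o in catalog
            if all(o.metadata.get(k) == v for k, v in criteria.items())]

hits = search(catalog, type="dicom")
print([o.silo for o in hits])       # matches found across two different silos
```

The point of the sketch is that the query never names a silo: once every object is described by metadata, the silo becomes an implementation detail of a single logical pool.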


Object storage can handle the volume challenge. It is essentially boundless, since it uses a flat namespace and is not bound by directory hierarchies. Hitachi’s HCP (Hitachi Content Platform) can scale the volume of data across internal and external storage systems, and from edge to cloud. HCP metadata management enables it to store a variety of unstructured data. Hitachi Vantara’s object store was designed for immutability and compliance, and has added Hitachi Content Intelligence to ensure the veracity of the data that it stores. HCP with Hitachi Content Intelligence provides a fully federated content and metadata search across all data assets, with the ability to classify, curate, enrich and analyze the data you have. Hitachi Vantara’s Hitachi Content Platform and Hitachi Content Intelligence provide the intelligent generation and curation of metadata to break down those silos of unstructured data. Pentaho, with its Pentaho Data Integration (PDI), provides a similar capability to break down the silos of structured data. In this way we can virtualize the silos of data for easier access and analysis and transform these separate data hubs into a data fabric.


Earlier, we mentioned that velocity was one of the attributes of big data. That referred to the velocity at which unstructured data is generated, not to the access speed of object storage systems. Since object storage systems use internet protocols, they cannot process data as fast as directly attached file or block systems. As a result, large analytic systems like Hadoop would ETL the data into a file system for processing. For years Hadoop has been the go-to data storage and processing platform for large enterprises. But as Hadoop solved the problem of storing and processing massive amounts of data, it created a new one: storing data in Hadoop is expensive, and its fault tolerance comes from 3x data redundancy. Storing this data indefinitely is costly, but the data is still valuable, so customers do not want to throw it away. HCP, with its lower-cost object storage options and 2x data redundancy, can solve the problem of storing and protecting massive amounts of data. But what about the lack of processing speed? This is where virtualization can also help.
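The 3x-versus-2x redundancy difference is easy to quantify. A back-of-the-envelope calculation (the 1 PB figure is an assumed example; 3x is HDFS's default replication factor):

```python
# Raw capacity needed to protect 1 PB of usable data under Hadoop's default
# 3x replication versus a 2x-copy object store.
usable_pb = 1.0
hadoop_raw = usable_pb * 3       # HDFS default replication factor
objstore_raw = usable_pb * 2     # two full copies
savings = 1 - objstore_raw / hadoop_raw
print(f"{hadoop_raw} PB vs {objstore_raw} PB raw -> {savings:.0%} less capacity")
```

At these assumptions, the 2x scheme needs a third less raw capacity for the same protected data, before any difference in per-TB media cost is counted.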


Hitachi Content Platform (HCP) has partnered with Alluxio to utilize its memory-speed virtual distributed file system in a certified solution that simplifies the challenge of connecting big data applications, like Hadoop, to Hitachi Content Platform, reducing storage costs while providing high-performance, simplified access to data. Alluxio lies between compute and storage and is compatible with both Hadoop and object storage. Alluxio intelligently caches only the required blocks of data, not the entire file, which provides fast local access to frequently used data without maintaining a permanent copy. Existing data analytics applications, such as Hive, HBase, and Spark SQL, can run on Alluxio without any code change, storing data to HCP while accessing it with high-performance, memory-speed access.
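The block-level caching idea can be sketched in a few lines. This is a toy LRU cache, not Alluxio's actual implementation; the file name and access pattern are invented. It just illustrates why caching only hot blocks beats copying whole files:

```python
from collections import OrderedDict

class BlockCache:
    """Toy block-level read cache: keeps only the hot blocks of a file in fast
    local storage, fetching misses from the (slow) backing object store."""
    def __init__(self, capacity_blocks, backing_store):
        self.capacity = capacity_blocks
        self.backing = backing_store        # e.g. dict of (file, block) -> bytes
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def read(self, file, block):
        key = (file, block)
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.cache[key] = self.backing[key]    # fetch just this block
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)     # evict least recently used
        return self.cache[key]

# A 100-block file, but the workload only touches a few hot blocks repeatedly.
store = {("log.parquet", b): bytes([b]) for b in range(100)}
cache = BlockCache(capacity_blocks=4, backing_store=store)
for b in [0, 1, 0, 1, 2, 0, 1]:
    cache.read("log.parquet", b)
print(cache.hits, cache.misses)             # most reads served locally
```

Even with room for only 4 of the file's 100 blocks, the skewed access pattern means the majority of reads hit the local cache, with no permanent full copy of the file anywhere.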


This is similar to what HCP provides as a virtual bottomless file system for NFS and SMB filers, where files are accessed through a local Hitachi Data Ingestor, which acts as a caching device to provide remote users and applications with seemingly endless storage. As the local file storage fills up, older data is stubbed out to an HCP system over S3. Cloud storage gateways like HDI, HCP Anywhere edge file servers, or third-party tools act as a cloud gateway connecting remote sites to the data center without application restructuring or changes in the way users get their data. This eases the migration from traditional NAS to cloud-based file services.
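A stub-out policy of this kind can be sketched as follows. This is an illustrative toy, not HDI's actual logic; the file names, sizes, watermark, and S3 location format are all invented:

```python
def stub_out(local_files, capacity_bytes, upload):
    """Toy version of a cloud gateway's stubbing: when the local tier is over
    capacity, move the least recently used files to the object store and
    leave a small stub (pointer) behind in their place."""
    used = sum(f["size"] for f in local_files.values() if not f.get("stub"))
    for name in sorted(local_files, key=lambda n: local_files[n]["atime"]):
        if used <= capacity_bytes:          # back under the watermark: stop
            break
        f = local_files[name]
        if f.get("stub"):
            continue                        # already stubbed out earlier
        upload(name, f)                     # copy the data out (e.g. over S3)
        used -= f["size"]
        local_files[name] = {"stub": True, "size": 0, "atime": f["atime"],
                             "location": f"s3://bucket/{name}"}
    return used

uploads = []
files = {"old.bin": {"size": 60, "atime": 1},    # least recently used
         "new.bin": {"size": 50, "atime": 2}}
used = stub_out(files, capacity_bytes=80, upload=lambda name, f: uploads.append(name))
print(used, uploads, files["old.bin"]["location"])
```

Users still see every file in the namespace; only the cold data has physically moved, which is what makes the local tier appear bottomless.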


Now is the time to add Virtualization to the Big Data “Vs” of Volume, Variety, Velocity, Veracity, Value.


Amazon’s cloud business grew nearly 48.9% in the second quarter of 2018, generating $6.11 billion in revenue, which contributed 11.5% of Amazon’s total revenues in that period. The cloud business was also one of the main contributors to Amazon’s record $2.5 billion in profit. This is a strong sign that the cloud business is growing and adding value to organizations.


In the early days of cloud there were concerns about trusting your data to the cloud. There were concerns about security, reliability and availability, which are core to any IT operation. There were also disasters: cloud companies like Nirvanix, one of the pioneers of cloud-based storage, went out of business and gave customers only two weeks to retrieve and remove their data. Several telcos started, then withdrew from, the cloud provider business. Amazon’s cloud business was introduced in 2002 and did not turn a profit until 2015. However, cloud and storage technologies have moved us past those concerns, and enterprises are moving more of their data to the cloud. In 1Q 2018, IDC reported that revenues generated by original design manufacturers (ODMs) who sell storage directly to cloud hyperscale data centers increased 80.4% year over year to $3.1 billion, representing 23.9% of total enterprise storage investments during that quarter.


One of the cloud technologies that has increased trust in the cloud is S3. S3 (Simple Storage Service) is a cloud computing web service that provides object storage through web service interfaces such as REST, using HTTP commands that enable data to travel across the internet. The S3 API provides the ability to store, retrieve, list and delete objects. S3 services include features that have increased the trust and usability of cloud storage, such as multi-tenancy, security and policy, atomic updates, lifecycle management, logging, notifications, replication, encryption, and billing. S3 was introduced by Amazon in 2006 and has been adopted by most major cloud providers.
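The four core operations behave like a key-value store addressed by bucket and key. The toy model below mimics their semantics in memory; real clients issue HTTP verbs (PUT, GET, DELETE) against a REST endpoint, and the bucket and key names here are invented:

```python
class ToyObjectStore:
    """In-memory model of the four core S3 operations: put, get, list, delete.
    This mirrors the API's semantics, not its wire protocol or full feature set."""
    def __init__(self):
        self.buckets = {}

    def put_object(self, bucket, key, body):
        self.buckets.setdefault(bucket, {})[key] = body   # atomic overwrite

    def get_object(self, bucket, key):
        return self.buckets[bucket][key]

    def list_objects(self, bucket, prefix=""):
        return sorted(k for k in self.buckets.get(bucket, {}) if k.startswith(prefix))

    def delete_object(self, bucket, key):
        self.buckets[bucket].pop(key, None)

s3 = ToyObjectStore()
s3.put_object("backups", "2018/09/db.dump", b"...")
s3.put_object("backups", "2018/09/logs.tgz", b"...")
print(s3.list_objects("backups", prefix="2018/09/"))   # prefix listing, no directories
s3.delete_object("backups", "2018/09/logs.tgz")
print(s3.list_objects("backups"))
```

Note that "2018/09/" is just a key prefix, not a directory: the flat namespace is what lets object stores scale without a hierarchy to maintain.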


The major storage technology that has increased trust in the cloud is object storage with an S3 interface. Hitachi’s object storage, Hitachi Content Platform (HCP), enables data to be tiered to the cloud over S3. Tiering enables an enterprise’s data center to move less active data to lower-cost storage capacity in the cloud while maintaining control of the object storage from the data center. HCP always maintains at least two copies of the data objects and eliminates the need for backups. The copies can be dispersed over a geographic area with geo-distributed erasure coding for availability. HCP also provides the flexibility to migrate data across different cloud platforms or, in the event of another Nirvanix-type failure, to continue operations while recovering the data. Migrating between clouds involves setting up a new account in another cloud and crypto-shredding the data in the old one. Operations continue on the second copy, and no data movement has to occur between the clouds, which would be very expensive since cloud storage involves access charges.
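A minimal sketch of an age-based tiering policy of the kind described above. The 90-day threshold, object names, and dates are invented for illustration, not taken from HCP:

```python
from datetime import datetime, timedelta

def tier_candidates(objects, min_idle_days=90, now=None):
    """Toy tiering policy: pick objects whose last access is older than the
    threshold, so they can be moved to cheaper cloud capacity over S3 while
    the on-premises object store keeps control of the metadata."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=min_idle_days)
    return [name for name, last_access in objects.items() if last_access < cutoff]

now = datetime(2018, 9, 1)
objects = {
    "invoice-2017.pdf": datetime(2017, 12, 1),   # cold: candidate for the cloud tier
    "dashboard.json":   datetime(2018, 8, 30),   # hot: stays on premises
}
print(tier_candidates(objects, min_idle_days=90, now=now))
```

The data center stays in charge of the policy: only the selected cold objects move, and the catalog that maps names to locations never leaves the private side.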


Since the S3 interface is being broadly adopted by cloud providers, it is relatively easy to interface HCP to most clouds. We have connected to many of the major cloud vendors in our Center of Excellence. Jeff Lundberg posted a blog on how we successfully tested data tiering from HCP to the Alibaba Cloud (Aliyun). This connection to Aliyun is exciting for us, since Aliyun is the fastest-growing cloud provider.


While Aliyun is only about one tenth the size of Amazon and is not yet profitable, it is growing faster than Amazon. According to a recent Synergy Research report, Alibaba surpassed IBM for the fourth ranking in worldwide cloud infrastructure, behind Microsoft Azure, Google Cloud, and of course Amazon AWS. Alibaba is the largest cloud computing company in China and operates in 18 data center regions and 42 availability zones around the globe. One is located near us in San Mateo, California. As of June 2017, Alibaba Cloud is placed in the Visionaries’ quadrant of Gartner’s Magic Quadrant for Cloud Infrastructure as a Service, Worldwide.


HCP’s unique object storage service plus S3 tiering provides organizations with a secure, efficient and cost-effective way to leverage cloud services from multiple cloud vendors while retaining control within their private data centers.

In the last few days this week we have seen a massive sell-off in cryptocurrencies. Bitcoin lost $1,000 in 24 hours. Other currencies, like Ethereum, Ripple, Bitcoin Cash and EOS, all dropped by around 20 percent. As of this Thursday, one bitcoin is worth $6,486.45 USD. This is down from nearly $20,000 in December 2017.


Bitcoin meltdown.png

One reason for the sell-off is a report that Goldman Sachs is dropping plans for a cryptocurrency trading desk. Another is the introduction of trading in bitcoin futures in early 2018, where we have seen massive market manipulation and suppression by experienced futures traders. There has also been an increase in cryptojacking, the theft of electricity and CPU cycles for blockchain mining, and in ransomware demanding payment in bitcoin. A college student allegedly stole bitcoins by SIM swapping, hijacking the mobile phone numbers of bitcoin investors. One of the crypto entrepreneurs he hacked had just crowdfunded $1 million for an ICO. The student was caught when his high-spending lifestyle drew attention, and he was tracked through the blockchain.


If terms like blockchain mining and ICO are new to you, you need to attend David Pinski's Blockchain session at NEXT 2018, September 25 to 27 in San Diego. David is the Chief Strategist for Financial Innovation and Head of the Hitachi Financial Innovation Laboratory. David’s breakout session is titled: Blockchain: Build Digital Trust in Your Enterprise. It is scheduled for Thursday 9/27 from 5:00 to 5:30.


David has a banking and technology background covering architecture, core banking systems, payments, strategy and venture, with major banks such as Capital One, ING Direct and Bank of America. He has startup experience in the areas of fraud, identity, and loan origination with companies such as Zumigo, Silver Tail Systems, and Octane Lending, and he has been issued four patents in the areas of fraud mitigation, payment networks and banking systems, with a dozen more pending.


Hitachi's NEXT 2018 is the premier event for the digital revolution. It's for business leaders who embrace the digital economy and base decisions on data to ensure success. Come to NEXT 2018 to see what's next for your data center, data governance, insights and innovation, from Hitachi and our many partners. Register at NEXT 2018.

Enterprise Storage Forum recently published their 2018 Storage Trends survey, which made some interesting observations. The biggest challenges for IT and business leaders in operating their current storage infrastructure were aging equipment and lack of storage capacity.



While much of the focus today is on high-performance flash and NVMe, the critical concerns are still aging gear, lack of capacity, high cost of operations, security, high maintenance costs, and silos, all ranking ahead of poor performance.


When one considers the data explosion being accelerated by big data, IoT, the increasing use of metadata, and AI/machine learning, it is not surprising that storage capacity is our greatest concern. You’ve heard analysts say that data is the new oil that fuels digital transformation. However, unlike oil, which is consumed once, data will be consumed over and over again for as long as we can store it. Machine learning gets more intelligent as it consumes more data, and that data can be reused for new learning models and analysis. In that sense data is more like gold, in that it retains and may even increase in value. But unlike gold, data can be replicated many times over for protection, compliance, accessibility and many other requirements, driving the need for more and more storage.


According to a survey of organizations involved in Big Data, AI and IoT applications, conducted by NVMe analyst firm G2M, “Over 79% of respondents believe that current processing/storage architectures will not be able to handle the amount of data in their industry in the next 5 years.”


Trendfocus reports that 2Q 2018 HDD storage capacity grew to 214.60 exabytes, up 3%, breaking the previous record of 204.07 EB shipped in 1Q 2018. This growth was driven mainly by the nearline HDD market, with sales reaching 14.15 million units and totaling 104.35 EB in 2Q. Most of the nearline capacity growth was driven by enterprise systems, where 43% of nearline units had capacities of 10TB or higher. Total SSD capacity shipped in 2Q 2018 increased 5% over 1Q and is estimated at about 20 EB. SSD capacities are increasing with 3D NAND (capacity is increased by stacking the memory cells vertically in multiple layers), and the capacity price of an SSD is now almost equivalent to enterprise HDD at about $.25 to $.27 per GB. Nearline HDDs are still much lower, at $.02 to $.03 per GB, which explains their growth in the storage market. Most applications don’t require the higher performance of an SSD, so a hybrid array with a mix of SSD, HDD, storage virtualization, and automated tiering to the cloud could be the best solution for capacity and aging infrastructure concerns.
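At the quoted price points, the gap between SSD and nearline HDD is roughly an order of magnitude. A quick calculation using midpoints of the ranges above (the 1 PB figure is an assumed example):

```python
# Cost of 1 PB of raw capacity on enterprise SSD (~$0.26/GB midpoint)
# versus nearline HDD (~$0.025/GB midpoint), per the figures quoted above.
gb_per_pb = 1_000_000
ssd_cost = 0.26 * gb_per_pb
hdd_cost = 0.025 * gb_per_pb
print(f"SSD: ${ssd_cost:,.0f}  HDD: ${hdd_cost:,.0f}  "
      f"ratio: {ssd_cost / hdd_cost:.0f}x")
```

A roughly 10x cost difference per petabyte is why hybrid arrays, which put only performance-sensitive data on SSD, remain attractive for capacity-driven buyers.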


Two of the key takeaways according to this 2018 storage trends survey referenced above are:


“Performance and cost drivers run neck-and-neck.
Balancing performance and cost aren't exactly new to storage administrators. What is new is the extreme storage performance that is now available with flash and flash accelerating technologies like NVMe. They are costlier than hybrid systems or lower cost AFAs, so it's important for IT to conduct detailed cost analyses for high performance storage purchases.


Flash adoption is steady but not roaring ahead of HDD.
HDD's installed base is massive and works well with all but the fastest high-performance applications. It makes no sense for IT to rip and replace disk-based or hybrid systems - and may not for years to come. Flash/SSD is still the high-performance choice: important for transactional systems but not critical for most business applications.”



The net today is that capacity is more of a concern than performance and hybrid arrays with a mix of SSD for performance and HDD for capacity with cloud connectivity may be the best option for IT and business leaders who want to optimize their storage infrastructure.


For more information on what Hitachi Vantara has in store to address your data and storage requirements, join us at NEXT 2018, September 25 to 27 in beautiful San Diego. The theme is “Your Data. Your Innovation.”

The hot topic in storage today is NVMe, an open-standards protocol for digital communications between servers and non-volatile memory storage. It replaces the SCSI protocol, which was designed and implemented for mechanical hard drives that processed one command at a time. NVMe was designed for flash and other non-volatile storage devices that may be in our future. The command set is leaner, and it supports a nearly unlimited queue depth that takes advantage of the parallel nature of flash drives (a maximum 64K queue depth for up to 64K separate queues). The NVMe protocol is designed to transport signals over a PCIe bus and removes the need for an I/O controller between the server CPU and the flash drives.
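The difference in outstanding-command capacity is dramatic. For contrast, the calculation below compares against SATA NCQ's single 32-deep queue (a figure added here for comparison, not from this article):

```python
# Outstanding-command capacity: legacy SATA NCQ vs the NVMe specification.
sata_ncq = 1 * 32                  # SATA NCQ: one queue, 32 commands deep
nvme_max = 65_535 * 65_535         # NVMe: up to 64K queues x 64K commands each
print(f"NVMe can track ~{nvme_max // sata_ncq:,}x more in-flight commands")
```

No real device or workload sustains billions of in-flight commands, but the headroom is the point: each CPU core can own its own deep queues, which is what lets NVMe exploit the internal parallelism of flash.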


PCIe (Peripheral Component Interconnect Express) is a standard type of connection for internal devices in a computer. NVMe SSD devices connected over PCIe have been available in PCs for some time. Hitachi Vantara has implemented NVMe on our hyperconverged Unified Compute Platform (UCP HC), where internal NVMe flash drives are connected directly to the servers through PCIe. The benefit of this is having a software-defined storage element and virtual server hypervisor on the same system, where you can access NVMe at high speed. It makes sense to us to first bring the performance advantages of NVMe to commodity storage like our UCP HC, because the improvements will be greater for our customers. There are considerations, though: since there is no separate storage controller, data services have to be done by the host CPU, which adds overhead. If a VM has to access another node to find data, you lose time. For smaller data sets this isn't an issue, but as the workload increases, it negates some of the performance advantages of NVMe. However, you are still ahead of the game compared to SCSI devices, and UCP HC with NVMe is a great option for hyperconverged infrastructure workloads.


NVMe is definitely the future, but the storage industry is not quite there yet with products that can fully take advantage of the technology. PCIe has not broken out of the individual computer enclosure to function as a high-speed, wide bandwidth, scalable serial interconnect of several meters in length between control platforms and I/O, data storage or other boxes within an I/O rack. Here are the current proposals for NVMe transport:

NVME Transport.png

Clearly this is an evolving area, and most storage solutions that are available with NVMe today use PCIe for the back-end transport. NVMe SSDs plug into a PCIe backplane or switch, which plugs directly into the PCI root complex. However, PCIe has limited scalability: only a relatively small number of flash devices can reside on the bus. This is okay for hyperconverged storage, but it's not what most customers are used to dealing with in all-flash arrays. Scalable enterprise NVMe storage arrays will likely require a fabric on the back end.


What one would like is an NVMe all-flash array with an enterprise controller for data services and shared connectivity over a high-speed fabric. The backend flash devices could connect to the controller or data services engine over an internal NVMe-oF, which would in turn connect to a host system using an external NVMe-oF over FC or RDMA. Since PCIe connections are limited in distance and do not handle switching, a fabric layer is required for host connectivity to external storage systems. While NVMe standards are available, both FC-NVMe and NVMe-oF are still works in progress: Rev 1 of the NVMe-oF standard was published in June 2016 by the NVM Express organization, and Rev 1 of the FC-NVMe standard was only released last summer by the T11 committee of INCITS. Only a few proprietary implementations are available today. Several of our competitors have announced AFAs that they claim to be NVMe “fabric ready.” In fact, they are promoting features that have not been tested for performance, resiliency or scalability and are based on incomplete standards that are still evolving. Implementing based on these promises can add a huge risk to your installation and tie you to a platform that may never deliver up to the hype.


Here is where I believe we are in the introduction of NVMe for enterprise storage arrays.

Enterprise NVME Arrays.png

NVMe-oF is needed to scale the connectivity and speed up the transmission of data between NVMe SSD devices and the controller, and FC-NVMe or NVMe-oF can do the same between the controller and a fabric-connected host. However, there is a lot that goes on in the SSD device, the controller, the fabric, and the host that can affect overall throughput. In fact, the congestion caused by the higher speeds and deeper queues of NVMe can negate the transmission speeds unless the entire system is designed for NVMe.


On the backend, flash drives require a lot of software and processing power for mapping pages to blocks, wear leveling, extended ECC, data refresh, housekeeping, and other management tasks, which can limit performance, degrade durability, and limit the capacity of the flash device. The higher I/O rates of NVMe could create bottlenecks in these software functions, especially on writes. While Hitachi Vantara flash storage systems can use standard SAS SSDs, we also provide our own flash modules: the FMD; the FMD DC2, with inline compression; and the FMD HD, for high capacity (14TB), to improve the performance, durability and capacity of NAND devices. To support these processing requirements, the FMDs from Hitachi Vantara are built with a quad-core multiprocessor, 8 lanes of PCIe on the FMD PCBA, and integrated flash controller logic that supports 32 paths to the flash array. Having direct access to the engineering resources of Hitachi Ltd., Hitachi Vantara is able to deliver patented new technology in our FMDs, which sets them apart from competitive flash vendors' offerings. As the NVMe rollout progresses, expect to see other vendors trying to catch up by putting more processing power into their flash modules. This advantage from our years of flash hardware engineering is one of the reasons why we aren't rushing NVMe into our Virtual Storage Platform (VSP) all-flash arrays. Our customers are already seeing best-in-class performance levels today.


One of the biggest reasons for a controller or data services engine is to provide a pool of storage that can be shared over a fabric by multiple hosts. This enables hosts and storage to be scaled separately for operational simplicity, storage efficiency and lower costs. Controllers also offload many enterprise functions that are needed for availability, disaster recovery, clones, copies, dedupe, compression, and so on. Because of their central role, controllers have to be designed for high availability and scalability to avoid becoming the bottleneck in the system. Dedupe and compression are key requirements for reducing the cost of flash and are done in the controller when both are required (note that compression is done in the FMDs when they are installed, but in the controllers for SSDs). The controllers for an NVMe array must support all these functions while talking NVMe to the backend flash devices and FC-NVMe or NVMe-oF across the fabric to the multiple hosts. Here again, the increase in workloads due to NVMe could create bottlenecks in the controller functions unless the controller has been designed to handle it.
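Content-addressed deduplication, one of those controller functions, can be sketched in a few lines. This toy version stores each unique block once, keyed by a hash of its contents; it illustrates the general technique, not any vendor's implementation:

```python
import hashlib

def dedupe_store(blocks):
    """Toy content-addressed dedupe: identical blocks are stored once,
    and every logical write just adds a reference to the stored copy."""
    store, refs = {}, []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)   # first writer stores the data
        refs.append(digest)               # later writers only add a reference
    return store, refs

# Three identical blocks (e.g. cloned OS images) plus one unique block.
blocks = [b"OS image page"] * 3 + [b"unique user data"]
store, refs = dedupe_store(blocks)
print(f"logical blocks: {len(refs)}, physical blocks: {len(store)}")
```

The hashing and lookup happen on every write, which is exactly why dedupe consumes controller cycles and why NVMe's higher I/O rates raise the bar for controller design.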


Over the many generations of VSP and the VSP controller software, SVOS, Hitachi has been optimizing the hardware and software for the higher performance and throughput of flash devices. The latest version, Storage Virtualization Operating System RF (SVOS RF), was designed specifically to combine QoS with a flash-aware I/O stack to eliminate I/O latency and processing overhead. WWN-, port-, and LUN-level QoS provide throughput and transaction controls to eliminate the cascading effects of noisy neighbors, which is crucial when multiple NVMe hosts are vying for storage resources. For low-latency flash, the SVOS RF priority handling feature bypasses cache staging and avoids cache slot allocation latency, for 3x read throughput and 65% lower response time. We have also increased compute efficiency, enabling us to deliver up to 71% more IOPS per core. This is important today and in the future because it frees up CPU resources for other purposes, like high-speed media. Dedupe and compression overheads have been greatly reduced by SVOS RF (which runs up to 240% faster while data reduction is active) and hardware-assist features. Adaptive Data Reduction (ADR) with artificial intelligence (AI) can detect, in real time, sequential data streams and data migration or copy requests that can be handled more effectively inline. Alternatively, random writes to cells that are undergoing frequent changes are handled in a post-process manner to avoid thrashing on the controllers. Without getting into too much technical detail, suffice it to say that the controller has a lot to do with overall performance, and more will be required when NVMe is implemented. The good news is that we’ve done a lot of the necessary design work within SVOS to optimize the data services engine in VSP for NVMe workloads.
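Per-port or per-LUN throughput controls of the noisy-neighbor kind are commonly built on token buckets. Here is an illustrative toy sketch of the general mechanism (not SVOS RF's actual implementation; the rate and burst numbers are invented):

```python
class TokenBucket:
    """Toy IOPS limiter of the kind a QoS feature applies per port or LUN,
    so one noisy neighbor cannot starve other hosts of storage resources."""
    def __init__(self, rate_iops, burst):
        self.rate = rate_iops            # tokens replenished per second
        self.tokens = self.burst = burst # start with a full burst allowance

    def tick(self, seconds=1.0):
        """Replenish tokens as time passes, capped at the burst size."""
        self.tokens = min(self.burst, self.tokens + self.rate * seconds)

    def try_io(self, n=1):
        if self.tokens >= n:
            self.tokens -= n
            return True                  # I/O admitted
        return False                     # I/O throttled

bucket = TokenBucket(rate_iops=100, burst=50)
admitted = sum(bucket.try_io() for _ in range(200))  # a sudden 200-I/O burst
print(admitted)  # only the burst allowance gets through before throttling
```

Excess I/Os from a greedy host are deferred rather than allowed to flood the shared controller, which is what keeps latency predictable for everyone else on the array.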


From a fabric standpoint, FC-NVMe can operate over a Fibre Channel fabric, so data centers could potentially use the technology they have in place by upgrading the firmware in their switches. The host bus adapters (HBAs) would need to be replaced or upgraded with new drivers, and the switches and HBAs could be upgraded to 32 Gbps to get the performance promised by NVMe. If NVMe-oF is desired, it will require an RDMA implementation, which means InfiniBand, iWARP, or RDMA over Converged Ethernet (RoCE). Vendors such as Mellanox offer adapter cards capable of speeds as high as 200 Gbps for both InfiniBand and Ethernet. Consideration needs to be given to the faster speeds, higher queue depths, LUN masking, QoS, and so on; otherwise congestion in the fabric will degrade performance. More information about NVMe over fabrics can be found in blogs by our partners Broadcom/Brocade and Cisco. J Metz of Cisco published a recent tutorial on fabrics for SNIA.
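For NVMe-oF over RoCE, for example, a Linux host typically uses the standard nvme-cli utility to discover and connect to subsystems over the fabric. The address and NQN below are placeholders, not a real configuration:

```shell
# Discover NVMe subsystems exported by a target at 10.0.0.5 (RoCE, standard port 4420)
nvme discover -t rdma -a 10.0.0.5 -s 4420

# Connect to one of the discovered subsystems (NQN is a placeholder)
nvme connect -t rdma -n nqn.2018-08.com.example:subsys1 -a 10.0.0.5 -s 4420

# Verify the fabric-attached namespaces now appear as local NVMe devices
nvme list
```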


Another consideration is whether current applications can keep up with the volume of I/O. When programs knew they were talking to disk storage, they could branch out and do something else while the data was accessed and transferred into their memory space. With NVMe latencies, it may be better to simply wait for the data than to incur the overhead of branching out, waiting for interrupts, and branching back.
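The break-even point can be expressed as a rule of thumb: if the device answers faster than the cost of sleeping and being rescheduled, it is cheaper to poll. This sketch uses placeholder costs of my own choosing:

```python
def io_strategy(device_latency_us, context_switch_us=5.0):
    """Pick busy-wait polling vs interrupt-driven waiting for an I/O request.

    Sleeping and waking costs roughly two context switches; when the device
    responds faster than that, spinning until the data arrives wins.
    """
    return "poll" if device_latency_us < 2 * context_switch_us else "yield"

assert io_strategy(8) == "poll"        # fast NVMe flash read: just wait for it
assert io_strategy(5000) == "yield"    # disk seek: do other work meanwhile
```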


NVMe is definitely in our future. However, moving to NVMe will take careful planning on the part of vendors and consumers. You don’t want to jump at the first implementation and find out later that you have painted yourself into a corner. Although the Hitachi Vantara VSP F series of all-flash arrays does not support NVMe at this time, it compares very favorably with products that have introduced NVMe.


A recent Gartner Critical Capabilities for Solid State Arrays report (August 6, 2018) provides some answers. In terms of performance rating, the VSP F series came in third, ahead of several vendors that offer NVMe. This evaluation did not include the latest SVOS RF and VSP F900/F700/F370/F350 enhancements announced in May, because they missed Gartner’s cutoff date for this year’s evaluation. These enhancements featured an improved overall flash design, with 3x more IOPS, lower latency, and 2.5x more capacity than previous VSP all-flash systems.


The only two vendors ahead of the F series in performance are the Kaminario K2 and the Pure Storage FlashBlade, neither of which has the high reliability, scalability, and enterprise data services of the VSP. In fact, the VSP F series placed highest in RAS (reliability, availability, serviceability) of all 18 products evaluated. The Kaminario K2 has a proprietary NVMe-oF host connection, which they call NVMeF, and the Pure Storage NVMe arrays followed us with a DirectFlash storage module instead of standard SSDs. One can assume that the performance of the Hitachi Vantara all-flash arrays would be higher if the new VSP models and SVOS RF had been included in the evaluation. Here are the product scores for the High-Performance Use Case for the top three places, on a scale from 1 to 5, with 5 being the highest.


Kaminario K2                                       4.13

Pure Storage FlashBlade                    4.08

Hitachi VSP F Series                            4.03

Pure Storage M and X Series              4.03


Hitachi product management leadership has confirmed that our VSP core storage will include NVMe in 2019, and we are happy to share more roadmap details with interested customers on an NDA basis. In the meantime, I recommend that you follow the NVMe blog posts by Mark Adams and Nathan Moffit.

Various world organizations and financial institutions have sought to develop a digitization index to evaluate a country’s ability to provide the environment businesses need to succeed in an increasingly digitized global economy. The following study, done in 2016, is interesting because it shows a strong correlation between digitization and GDP per capita: the most digitized countries have higher GDP per capita.

Digitization chart.png

Where a country falls on the digitization index depends on how the index is defined. However, several studies using different digitization indices have produced very similar country rankings. BBVA has published a working paper defining a Digitization Index (DiGiX), which is widely accepted. It is a comparative index that rates countries from zero to one; the BBVA study shows Luxembourg with the highest rating of 1.0. Although the GDP chart above uses a composite index, the digitization index on its x-axis ranks the countries in a similar manner to the BBVA DiGiX. The y-axis is an assessment of GDP per capita excluding the impact of oil revenues. This may not be entirely valid, since digitization could influence oil revenues in the same way that it impacts all revenues.
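Composite indices like DiGiX are typically built by normalizing each indicator across countries to a 0-1 scale and then aggregating. Here is a minimal Python sketch with hypothetical data and equal weights (the real DiGiX aggregates roughly 100 indicators across several weighted dimensions):

```python
def digitization_index(indicators):
    """Min-max normalize each indicator across countries, then average.

    `indicators` maps indicator name -> {country: raw value}.
    Data below is hypothetical, for illustration only.
    """
    countries = next(iter(indicators.values())).keys()
    scores = {c: 0.0 for c in countries}
    for values in indicators.values():
        lo, hi = min(values.values()), max(values.values())
        for c, v in values.items():
            scores[c] += (v - lo) / (hi - lo)   # 0 = worst country, 1 = best
    return {c: s / len(indicators) for c, s in scores.items()}

idx = digitization_index({
    "internet_users_pct": {"A": 95, "B": 60, "C": 30},
    "broadband_mbps":     {"A": 120, "B": 40, "C": 10},
})
assert idx["A"] == 1.0 and idx["C"] == 0.0   # best and worst hypothetical country
```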


The following slide shows the types of indicators that were considered in BBVA’s DiGiX.

Digital Index.png


One anomaly on this chart is South Korea. South Korea is one of the most technologically advanced countries, with technology leaders like Samsung, LG, and Hyundai. Korea Telecom dazzled everyone at the Winter Olympics with its display of 5G networks; 92.4% of the population uses the internet, and South Korea has consistently ranked first in the UN ICT Development Index. However, South Korea’s GDP is below that of countries that rank much lower on the digitization index. This shows that digitization is not the only factor driving GDP.


Korean workers.png

South Korean workers are the hardest working in the world, logging some 300 more work hours per year than workers in most other countries, and yet productivity and prosperity are lagging behind less digitalized countries. One reason given is that the retirement system is not adequate to support retired workers, so many of them must find low-paying jobs after retirement. Until recently the retirement age was 55, and young workers had to look forward to supporting their parents and grandparents. According to the New York Times, skyrocketing household debt, high youth unemployment, and stagnant wages are hobbling the economy. Young people have to scramble to compete for a small pool of jobs at large, prestigious companies or accept lower-paid work at smaller companies. Many cannot find work at all, and the unemployment rate for young people is 10%. These are political and social issues that go beyond digitalization.


Lim Wonhyuk, professor of economic development at the KDI School of Public Policy and Management was quoted in the New York Times offering this suggestion: “The government needs to nurture a business ecosystem that is more ably disposed to start-ups protecting their intellectual property rights and giving them better financial access and incubating and supporting them.” What is needed is a way to spur innovation and disrupt the business models of established businesses.


Until recently, South Korea was not known as a tech startup hub. Very few “unicorns” (startup companies that reach a $1 billion valuation) have come from South Korea. That may be about to change, according to Crunchbase, a publisher of news covering the technology industry. Crunchbase noted in a recent article, South Korea Aims for Startup Gold, that there is an increase in VC funding and a competitive lineup of potential unicorns.


Another interesting country on the chart above is China. While the chart shows China very low on the digitization index, China leads the world in innovation when it comes to the number of unicorns. According to Business Insider, Chinese brands, including Alibaba and Moutai, hold the top three spots on the list of the 20 fastest-growing brands, while Amazon holds the thirteenth spot. That has a lot to do with the fact that China has the largest population in the world, close to 1.4 billion. That also lowers China’s GDP per capita, to about 8,600 USD. Consider what it would do to China’s GDP if the digitization rate were to increase by 10 points!


Digitization clearly has a positive impact on GDP per capita. However, digitization does not occur without government focus, since it depends heavily on infrastructure, connectivity, and regulation. As a result, we are seeing government digitization initiatives around the world.

MyRepublic Candy.png

On one of my trips to Singapore, in June 2016, I was invited to meet with a new startup in the telco space called MyRepublic. MyRepublic was started in 2011 to leverage Singapore’s exciting Next Gen NBN rollout, the first of many National Broadband Networks (NBNs) happening in the Asia-Pacific region. By January 2014, this startup was able to launch the world’s first 1 Gbps broadband plan in Singapore under S$50. The telco industry is well established, with many large players with billions of dollars in revenue, so I was interested in how a startup could possibly disrupt it and be the first to launch such a service. This disruption was on the scale of an Uber in the transportation business or an Airbnb in the hospitality business! Startups like MyRepublic have become known as telcotechs, a term increasingly used in the telco industry, similar to the use of fintech in the financial industry, to refer to technically innovative companies that are disrupting their industries.

In an interview in 2014, founder Malcolm Rodrigues attributed MyRepublic’s strong growth to what he called the “thin operator model”: “We think we’ve re-engineered the economics of a telco,” said Rodrigues. “Today we bill and invoice about 25,000 customers. All the invoicing and the CRM system are in the cloud. When I was at a telco before MyRepublic, we spent about $300 million dollars on an IT platform. [At MyRepublic] we built a cloud-based CRM and accounting system using the best stuff available and stitching it together through open APIs. I’d say we spent about 80,000 bucks to do that, and our running cost is around 3,000 dollars a month.”

When asked how he expected to make money with such a small number of customers, he said, “The telecom is a beautiful business. It’s all recurring revenue. When you’re selling software packages, you have to find new customers every month.” Because they are buying a utility, customers tend to switch providers only after a long period of time. At that time in 2014, MyRepublic held about 1% of the internet service provider market, with hopes of reaching 5% in a few years. While that is a small slice of the pie, they were able to triple their revenue that year, from S$5M to S$15M. Rodrigues also believes that traditional telcos create a walled garden to discourage users from going to other services like WhatsApp or Skype. A new way of life is coming that will require telcos to work closely with content providers. He believes that MyRepublic must be committed to user experience and maintain credibility in the eyes of the public.

When I met with MyRepublic in 2016, they were providing ultra-fast internet service to over 30,000 homes and businesses in Singapore. Since they were cloud native, they were not interested in Hitachi’s IT infrastructure solutions; their interest was in industry trends and directions and in Hitachi’s IoT initiatives. Last week I participated in Hitachi Vantara’s Southeast Asia CIO and Partner conference, where we were fortunate to have Eugene Yeo, Group CIO of MyRepublic, as a main speaker. He provided an update and shared his experiences driving business agility and transformation across their footprint, which now includes Singapore, Australia, New Zealand, and Indonesia, with plans for Cambodia, Myanmar, Malaysia, the Philippines, Vietnam, and Thailand. Today, MyRepublic has a customer base of over 200,000 households.

Eugene Yeo.png

This summer MyRepublic moved into the mobile space as a mobile virtual network operator (MVNO) by partnering with StarHub in Singapore and Tata Communications across Singapore, New Zealand, Australia, and Indonesia. An MVNO is a wireless communications services provider that does not own the wireless network infrastructure over which it provides services to its customers. It obtains bulk access to network services at wholesale rates, then sets retail prices independently, using its own customer service and billing support systems. This lets MyRepublic’s telcotech platform expand from home broadband into mobile services without any capital investment in its own mobile network infrastructure or services management, and lets MyRepublic offer highly competitive pricing across all its services.

Eugene is a firm believer in using IT as a strategic tool to ensure that MyRepublic remains innovative and ahead of the competition. Tech innovations at MyRepublic include embracing the cloud, open source, and a whole slew of disruptive technologies to grow the business. They developed their own business support systems (BSS) and operations support systems (OSS) and adopted public cloud for the BSS and OSS stack. Yeo is also a staunch supporter of open source: the company partnered with Red Hat to deploy OpenStack in early 2017. With a team of 70 to 80 engineers, they use open source, from databases to libraries and workflow engines, to get to the next level faster. They can deploy to new markets in less than 60 days!

Recently, MyRepublic engaged Hitachi to help accelerate the company’s telecommunications technology strategy and enhance business operations across its customer base. MyRepublic selected Hitachi Vantara’s Pentaho Data Integration and Business Analytics platform to help it expand further into the region and to launch mobile services on top of existing broadband services. MyRepublic is using the Pentaho platform to improve productivity around data storage and operational efficiency, and to provide enterprise-grade scalability. With the Pentaho platform, MyRepublic was able to integrate and blend data from disparate sources and then create the necessary dashboards with just two software engineers. Pentaho also allows MyRepublic to easily embed dashboards within its BSS, CRM, and back-office systems so that users can access insights while working within the operational systems.
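The kind of blend described, joining records from one operational source with data from another before feeding a dashboard, can be illustrated in a few lines of Python. The data here is entirely made up; in production this step would be a Pentaho Data Integration transformation:

```python
# Toy blend: enrich billing records with usage figures from a second source,
# keyed on a customer id, before handing the result to a dashboard.
billing = [
    {"cust": 1, "plan": "1Gbps",   "bill": 49.0},
    {"cust": 2, "plan": "500Mbps", "bill": 39.0},
]
usage = {1: 812, 2: 305}   # GB transferred this month, from another system

blended = [dict(rec, usage_gb=usage.get(rec["cust"], 0)) for rec in billing]
assert blended[0]["usage_gb"] == 812
```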

MyRepublic Pentaho.png

Eugene Yeo said, “While we have made significant manpower savings, the bigger benefit is the robust data pipeline we’ve been able to build. Pentaho allows us to add data to this pipeline rapidly, which is important to this vision. Similar to what fintech players achieved with the financial services industry, it paves the way for us to create new data monetization models that will lead to further innovation in the industry.”

MyRepublic has plans to IPO within 24 months. We are very proud to be a partner with them and pleased to be able to support them with our products and services.

VMworld 2018.png

Data centers are digitally transforming from infrastructure providers into providers of the right service at the right time and the right price. Workloads are becoming increasingly distributed, with applications running in public and private clouds as well as in traditional enterprise data centers. Applications are becoming more modular, leveraging containers and microservices as well as virtualization and bare metal. As more data is generated, there is corresponding growth in demand for storage space efficiency and storage performance, utilizing the latest flash technologies. Enterprises will need to focus on a more secure, automated data center architecture that enables micro-segmentation and artificial intelligence, while reducing complexity and increasing flexibility.


Infrastructure choices can significantly impact the success of digital transformation (DX). Converged infrastructure (CI) and hyperconverged infrastructure (HCI), with their inherent benefits, are gaining popularity among enterprises on the DX journey. Converged and hyperconverged infrastructure can deliver business and IT value in new and evolving ways:

  • Improved system performance and scalability
  • Simplified management and operations
  • Single point of support
  • Reduced costs to deploy and maintain


Hitachi Vantara provides three unified compute platforms:


Hitachi Unified Compute Platform CI series (UCP CI) converged infrastructure systems combine the power of high-density Hitachi Advanced Server DS series servers, built on the latest Intel Xeon Scalable processor technologies, with award-winning Hitachi Virtual Storage Platform (VSP) unified storage arrays. Flexible options let you configure and scale compute and storage independently to optimize performance for any workload, whether a critical business application or a new digital service. Automated management through Hitachi Unified Compute Platform Advisor (UCP Advisor) software simplifies operation and gives you a single view of all physical and virtual infrastructure from one screen. VSP G series and F series arrays take advantage of Hitachi’s unified technology and industry-leading all-flash features and performance. Networking is provided by Brocade Fibre Channel and Cisco IP switches. UCP CI targets midsize and large enterprises, departments, and service providers who need the flexibility to combine mode-1 business applications with modern cloud-native services in a single, automated data center package.


Hitachi Unified Compute Platform HC (UCP HC) offers a scalable, simple, and reliable hyperconverged platform. The UCP HC family simplifies the scale-out process and provides elasticity to closely align the IT infrastructure with dynamic business demands. Start with what you need and scale to keep pace with business growth, without committing massive capital upfront. UCP HC leverages x86 CPUs and inexpensive storage, integrated with VMware vSphere and vSAN, to reduce the total cost of ownership. It delivers high VM density to support a mix of applications, eliminating the need for storage sprawl. Modern data reduction technologies (deduplication, compression, erasure coding) reduce storage needs by up to seven times to boost return on investment (ROI) by leveraging NVMe flash hyperconverged infrastructure. Recently introduced NVIDIA GPU acceleration dramatically improves the user experience in modern workspaces. You can read the datasheet here. The success story of South African retailer Dis-Chem pharmacy highlights the compelling business benefits of the Hitachi hyperconverged solution.


Hitachi Unified Compute Platform RS (UCP RS) series accelerates your move to hybrid cloud by delivering a turnkey, fully integrated, rack-scale solution powered by VMware Cloud Foundation. The rack-scale UCP RS solution combines the hyperconverged Hitachi UCP HC system, based on industry-leading VMware vSAN, with NSX network virtualization (SDN) and software-defined data center (SDDC) management software to deliver a turnkey, integrated SDDC solution for enterprise applications. The SDDC manager and Hitachi Unified Compute Platform Advisor (UCP Advisor) non-disruptively automate the upgrades and patching of physical and virtual infrastructure, freeing IT resources to focus on other important initiatives. The SDDC manager automates the installation and configuration of the entire unified SDDC stack, which consists of nodes, racks, spine-leaf switches, and management switches. UCP RS introduces new resource abstraction and workload domains for creating logical pools across compute, storage, and networking. This dramatically reduces the manual effort required and fast-tracks the availability of IT infrastructure for private and hybrid cloud.


All these solutions will be on display at Hitachi Vantara booth #1018 at VMworld Las Vegas, August 26-30. Dinesh Singh has blogged about what we will be covering at this VMworld. There will be solution demos and breakout sessions where you will learn best practices from your peers, like Conagra Brands and Norwegian Cruise Line. There will also be short, crisp 15-minute theater presentations running every hour, covering topics like NVMe storage, data analytics, SDDC hybrid cloud, and many more. Some of our alliance partners, like Intel, Cisco, and DXC Technology, will join these sessions. You will also be able to meet our technology experts who are developing the next cutting-edge solutions.


Be sure to save 4:30 PM on August 28 on your schedule for the popular Hall Crawl, where we will be serving delicious sushi and sake at booth #1018.



SHARE is a volunteer-run user group for IBM mainframe computers that was founded in 1955 and is still active today, providing education, professional networking, and industry influence on the direction of mainframe development. SHARE members say that SHARE is not an acronym; it is what they do. SHARE was the precursor of the open source communities that we have today.


The mainframe market is alive and well and may be on the verge of a renaissance in the coming IoT age. We have all seen the staggering projections of 30+ billion new internet-connected devices and a global market value of $7.1 trillion by 2020. That is almost 8 times the estimated 4 billion smartphones, tablets, and notebooks connected today. It translates into a staggering amount of additional new transactions and data, which means compute and data access cycles, as well as storage. That many new devices connected to the internet also opens up many more security exposures.


These are areas where mainframes excel, with their unique architecture of central processing units (CPUs) and channel processors that provide an independent data and control path between I/O devices and memory. z/OS, the mainframe operating system, is a share-everything runtime environment that gets work done by dividing it into pieces and giving portions of the job to various system components and subsystems that function independently. Security, scalability, and reliability are the key criteria that differentiate the mainframe, and they are the main reasons why mainframes are still in use today, especially in high-transaction, high-security environments like core banking. These same capabilities will be required by the back-end systems that support IoT.


Hitachi Vantara is one of the few storage vendors that support mainframes, with its scalable VSP enterprise systems. We are a Silver Sponsor of the upcoming SHARE St. Louis 2018, August 12-17. We will be sponsoring booth #324, where our mainframe specialists can answer your questions or take your requirements. We will also be presenting the following topics:


Tuesday, August 14, 4:30 PM - 5:45 PM

Hitachi Mainframe Recovery Manager Reduces Risk and Simplifies Implementation Effort

Hitachi Mainframe Recovery Manager (HMRM) is a simpler, more focused, lower-cost, streamlined mainframe failover and recovery solution that provides the functionality you actually care about and nothing you don’t.

Room: Room 100

Session Number: 23402

Speaker: John Varendorff



Wednesday, August 15, 11:15 AM - 12:15 PM

Hitachi Vantara G1500 Update - Latest and Greatest

The Hitachi Virtual Storage Platform G1500 continues the evolution of Hitachi’s virtualized storage architecture with advances in the technology which have increased performance, expanded usability with active-active stretched clusters, extended storage virtualization with Virtual Storage Machines (VSMs), and added non-disruptive scale out capabilities to extend the life of the system.

Room: Room 267

Session Number: 22876

Speaker: Roselinda Schulman


Thursday, August 16, 12:30 PM -1:30 PM

Hitachi Vantara, Data Resilience and Customer Solutions for the Mainframe in the IOT Age

Please join Hitachi Vantara for a discussion of the IoT age and how it affects mainframe environments today and moving forward. This session will discuss data resilience and its importance, technologies for now and the future, and the growing importance of analytics in companies across the globe. We will also discuss the solutions Hitachi Vantara has today for mainframe environments and where we may be going in the future.

Speaker: Lewis Winning


Please visit our booth and sessions at SHARE St. Louis and see how our mainframe systems, software, and services will support the security, scalability, and reliability required by the age of IoT.

It’s been two weeks since my cancer surgery, and this is my first blog post since then. Hitachi Vantara has a major focus on developing IoT solutions in various industries, including healthcare, so my experience with chemotherapy, surgery, and hospital care was very interesting from a recipient’s point of view. It gave me a much greater appreciation for the tools that healthcare requires today and where improvements are needed.



The first need is for the sharing of information among all the healthcare providers involved in a patient’s care, and the optimization of the mix of medications and treatments required by each discipline: oncology, surgery, cardiology, endocrinology, urology, neurology, psychology, and so on. Many of these medications have side effects that may interfere with other treatments, and some have to be suspended or altered when a new disease or condition needs to be treated. Each condition alone has many different treatment options. In my case, the DNA of my cancer cells is being analyzed to determine the best treatment. Hitachi provides content platforms and content intelligence systems for centralizing and sharing large volumes of data. Hitachi also has many projects applying AI and machine learning tools, working with major medical facilities around the world. Most that I know of are targeted studies, like optimizing the mix of medication for personalized diabetes control or cardiac sleep studies, so there is a lot to learn through AI and machine learning about the interactions of different medical conditions.


My chemo treatment consisted of a portable pump that I carried for 48 hours every two weeks, which enabled me to continue most of my work prior to the recent surgery. The purpose of the chemo was to isolate and shrink the tumors in my liver and small intestine so that they could be surgically removed. Chemo treatment consists of infusing your body with a cocktail of poisons that inhibit the growth of, or kill, the cancer cells. Unfortunately, it kills healthy cells as well. The purpose of the DNA study was to find a way for my immune system to attack this particular cancer; that study is still ongoing. The chemo mix I was given is called "5FU". If you look it up on social media, it is often referred to as “5 feet under” for its effect on some patients. There needs to be a faster way to develop safer immunological treatments. Since immediate treatment was required, I opted not to wait for the DNA study and went with the chemo and surgery.


The surgery was another area for advanced technologies. The surgeon described the procedure, which was amazing. Mounted in front of him in surgery were two screens showing an MRI scan and a CT scan taken earlier to identify the locations of the lesions. These are cross-sectional views of my body, which he had to correlate with the longitudinal view of what he saw on the table in front of him, through a 6-inch incision in my abdomen. The MRI and CT scans are point-in-time views of flexible organs that change in real time as the surgeon starts to work on them. The targeted lesions are in organs hidden behind other tissues and organs. Somehow, using a real-time ultrasound imaging system during surgery, he was able to locate the lesions, which are as small as several millimeters in size, and excise them. The large tumor in the small intestine had to be extracted from the surrounding tissues and lymph nodes, and the intestine reconnected to the stomach. He removed these tissues, which will be preserved for future studies; prior to the operation I signed a permission form that allows Stanford Research to use these tissue samples until the year 2502! The surgery took 5 hours of very concentrated, intricate, skillful work. You can imagine the physical strain that this relatively minor surgery put on the surgical team; imagine what an organ transplant would require! While Hitachi Healthcare provides tools to assist surgeons, like the real-time ultrasound imaging system used in this surgery, it may be some time before robots will be able to replace skilled surgeons in these types of surgeries.


This experience has given me a much greater appreciation for the possible benefits of AI, machine learning, and robotics that companies like Hitachi are working on. Most of you know someone who has gone through similar life experiences, perhaps even in your own lives. This gives a new level of urgency to what we do at work. It’s not just about the pay and recognition. It is really about social innovation and the difference it can make in our lives.

Bill Schmazo.png


I am very pleased to welcome Bill Schmarzo to our Hitachi Vantara community. Bill is very well known in the Big Data Community having authored two books, “Big Data: Understanding How Data Powers Big Business” and "Big Data MBA: Driving Business Strategies with Data Science."  He is considered the Dean of Big Data. He’s an avid blogger and frequent speaker on the application of big data and advanced analytics to drive an organization’s key business initiatives. Bill teaches at the University of San Francisco (USF) School of Management, where he is their first Executive Fellow.


Bill joins Hitachi Vantara as a Big Data Analytics Visionary, CTO IoT and Analytics. In this role he will guide the technology strategy for IoT and Analytics. With his breadth of experience delivering advanced analytics solutions, Bill brings a balanced approach regarding data and analytic capabilities that drive business and operational outcomes.  Bill will drive Hitachi Vantara’s “co-creation” efforts with select customers to leverage IoT and analytics to power digital business transformations.  Bill’s background includes CTO at Dell EMC and VP of Analytics at Yahoo.


When he joined Hitachi Vantara, he posted two blogs on LinkedIn which I am linking to below to help you understand his thinking and his perspective on what we are doing at Hitachi Vantara.


In the first post, Bill’s Most Excellent Next Adventure, he explains why he left Dell EMC with a heavy heart after 7 marvelous years and decided to join Hitachi Vantara. Here is a quote from that post. Please take the time to read the complete post.


“I will be joining Brad (Brad Surak, Chief Product and Strategy Officer) at Hitachi Vantara – the digital arm of Hitachi – as their Chief Technology Officer for Internet of Things (IOT) and Analytics.  I will have a chance to leverage nearly everything that I have been working on over the past 20 years.  And instead of just talking, writing and consulting on Digital Transformation, this time I will get the chance to actually do it; to get my hands dirty and bloody my elbows making it all work.  It truly is the best job I could have dreamed of, and if this will be the final chapter in my working career, then damn it, let’s make it a great one!

And I promise to take you all along on this most excellent adventure.  You can learn first-hand what’s working, what’s not working and what we all can do to make Digital Transformation a reality. You will learn from my successes and learn even more from my failures.  And in the end, I hope that we will all be a bit smarter from the experience.”


In his second post, My Journey Through the Looking Glass: Hitachi Vantara Introduction Bill gives his impressions on his first week in Hitachi Vantara.


“I spent my first week at Hitachi Vantara attending sales kickoff, and wow, was I impressed! Hitachi Vantara is making a pivot to build upon its core product and services strengths to deliver data-driven innovations around data monetization and business outcomes. And I could not be in a happier place (sorry Disney), as this is everything that I have been teaching and preaching about the past 30+ years; that organizations must become more effective at leveraging data and analytics to power our customers’ business models.”


In this post he goes on to give his views on big data, digital transformation, IoT, and analytics: business model maturity, helping customers with their digital journey, the why of digital transformation (it starts with the business), monetizing your IoT, digital transformation value creation, and digital transformation customer journey mapping, with links to previous blogs in which he addressed these topics. Please read this post and follow those links for a wealth of great insights. I expect to see even more as Bill begins posting on our Hitachi Vantara Community.


He concludes this post with the following comment:


“I can’t say enough about how energized I am about where Hitachi Vantara is going from an IOT, analytics and digital transformation perspective. It matches everything that I have been teaching and preaching over the past many years, and I am eager to build it out in more detail via hands-on customer engagements.”


I have been reinvigorated just by meeting Bill and hearing his validation of our strategy and direction. Please welcome Bill Schmarzo to our Hitachi Vantara Community!

Hu Yoshida

Beyond Cool!

Posted by Hu Yoshida, Jun 20, 2018

In my last blog post I commented on Hitachi Vantara’s selection as one of the “Coolest Business Analytics Vendors” by CRN (Computer Reseller News) and expanded on Hitachi Vantara’s business analytics capabilities. CRN’s report positions business analysis tools at the top of the big data tools pyramid, where they derive insight and value from the ever-growing volume of data. In this post I will expand on how we address the rest of the big data pyramid.


Data Fabric.png


Other analysts and trade publications like Network World also refer to this as a big data fabric, which is gaining more attention as analytics become a driving force behind business outcomes. Whether you think of it as a big data pyramid or a big data fabric, the concept is the same: a converged platform that supports the storage, processing, analysis, governance, and management of data that is currently locked up in separate application silos. That siloed data is the biggest hurdle to overcome in developing meaningful and accurate business analytics.


A comprehensive big data pyramid or big data fabric must provide features and functionality such as data access, data discovery, data cleansing, data transformation, data integration, data preparation, data enrichment, data security, data governance, and orchestration of data sources, including support for various big data fabric workloads and use cases. The solution must be able to ingest, process, and curate large amounts of structured, semi-structured, and unstructured data stored in big data platforms such as Apache Hadoop, massively parallel processing (MPP) databases, enterprise data warehouses (EDW), NoSQL stores, Apache Spark, in-memory technologies, and other related commercial and open source projects, and do it simply, efficiently, and cost effectively.


The strength of the information fabric you weave is directly affected by the quality of the data you stitch together. For any organization that is actively investing in anything remotely close to a data lake, the focus must be on data quality before data use. A key point that many organizations seem to miss in determining the worth of their data is that just because data is being collected does not mean they are collecting the right data. They may be collecting very little of something very important, or not collecting the right data at all. Data quality impacts business effectiveness.


Data quality efforts are only reliable when they occur as close as possible to the point of data creation, long before the data is given asylum in the data center or blended downstream for some other business purpose. This is where Hitachi Vantara really shines. Both Hitachi Content Intelligence and Pentaho can be used as “data quality gateways,” designed and implemented in the data stream to improve data veracity and bolster the source of truth expected of the information fabric. Regardless of whether we are talking about discovery, orchestration, management, governance, control, or preparation, focusing on the quality and correctness of data is what makes the information fabric reliable and trustworthy. More importantly, when you perform these veracity activities is up to you; that is the power offered by Hitachi Vantara’s solutions. We would certainly suggest that it be well before the data is stored in your data center, but we do not force that best practice on our customers.
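To make the “data quality gateway” idea concrete, here is a minimal, generic sketch of the pattern: validate records in-stream and quarantine bad ones before anything lands in long-term storage. This is purely illustrative; the record fields and validation rules are hypothetical, and it does not use any Pentaho or Hitachi Content Intelligence API.

```python
# Hypothetical sketch of a "data quality gateway" pattern: validate
# records in-stream, quarantining bad ones before they reach storage.
# Field names and rules below are invented for illustration only.

def validate(record):
    """Return a list of quality problems found in a sensor record."""
    problems = []
    if not record.get("device_id"):
        problems.append("missing device_id")
    temp = record.get("temperature_c")
    if temp is None or not (-50 <= temp <= 150):
        problems.append("temperature out of plausible range")
    return problems

def gateway(stream):
    """Split an incoming stream into clean records and quarantined ones."""
    clean, quarantined = [], []
    for record in stream:
        problems = validate(record)
        if problems:
            quarantined.append({"record": record, "problems": problems})
        else:
            clean.append(record)
    return clean, quarantined

incoming = [
    {"device_id": "s-01", "temperature_c": 21.5},
    {"device_id": "", "temperature_c": 22.0},      # no device id
    {"device_id": "s-02", "temperature_c": 999.0}, # implausible reading
]
clean, quarantined = gateway(incoming)
print(len(clean), len(quarantined))  # 1 clean record, 2 quarantined
```

The key design point is that quarantined records are kept, with the reasons they failed, so quality rules can be audited and refined rather than silently dropping data.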


Additionally, our two products allow you to sub-segment your dataset based on the desired business outcome. For example, if you are working with known datasets in a way that allows you to answer very specific questions in a time-sensitive manner, Pentaho provides the right solution, with additional capabilities to blend and visualize that data. If the business outcome is based more on exploratory activities across multiple datasets, and there is time to conduct the exploration, then Hitachi Content Intelligence provides the ideal solution. Both can process data in-stream and at contextual levels. Both offer the ability to leave the data where it resides, or to augment and migrate it to our data services platform (Hitachi Content Platform), where it can be stored for its lifespan in a compliant, self-protected, and secure manner.


Data is a reusable commodity: value may be gained from different data points, and the same data can continue to provide additional and valuable insights that may not have been imagined in the first analysis. Hitachi Content Platform is the ideal repository for long-term retention of big data, with geographically distributed data protection that eliminates the need for backup, secure multitenancy, governance and security features, extensible metadata, self-healing reliability and availability, low-cost erasure-coded storage, a cloud gateway, and the speed and scalability that leverage the latest advances in infrastructure technology.


This is a very high-level view of what we provide for big data fabrics. In subsequent posts, I will expand on some of the concepts that differentiate us in the big data space. Hitachi Vantara has the most comprehensive set of big data and big data analytics tools, built around our integrated Hitachi Content Platform, Hitachi Content Intelligence, and Pentaho solution sets.

Hu Yoshida

How Cool Is That!

Posted by Hu Yoshida, Jun 12, 2018

CRN (Computer Reseller News), a leading trade magazine, has named Hitachi Vantara one of the 30 Coolest Business Analytics Vendors. It may surprise many to see Hitachi Vantara, part of a 118-year-old company with traditional values like Harmony (Wa), Sincerity (Makoto), and Pioneering Spirit (Kaitakusha-Seishin), in the middle of a list of technology startups. Hitachi and Hitachi Vantara consider business analytics to be one of the key drivers of our customers’ success in this age of big data, digital transformation, and IoT, and we are approaching business analytics with the same startup, or pioneering, spirit that has sustained us for over 118 years.


Cool Analytics.png

Hitachi Vantara’s appearance in this list of 30 “cool” companies may also be surprising from a “coolness” standpoint. Most of these companies are hip new startups. The next oldest company is Microsoft, which, like us, has had to reinvent itself many times to remain relevant.


Actually, Hitachi Vantara is the new kid on the block, since it was formed in September of 2017 with the merger of Hitachi Data Systems (IT infrastructure systems), Hitachi Pentaho (data integration and analytics), and Hitachi Insights (IoT). CRN recognizes that Hitachi Vantara is able to provide “cloud, Internet of Things, big data, and business analytics products under one roof.” CRN cites Pentaho as a core Hitachi Vantara product for data integration, business analytics, and data visualization. CRN also mentioned Pentaho’s new machine learning orchestration tools, available as a plug-in through the Pentaho Marketplace, which help data scientists better monitor, test, retrain, and redeploy predictive models in production.


We have registered over 1,500 Pentaho enterprise licenses. And since Pentaho is open source, with a thriving community, there are hundreds of thousands of open source users, and we are adding roughly 5,000 to 10,000 new users per week. While Pentaho positions us to have a place on this list, there is much more that Hitachi Vantara can provide for big data and business analytics.


CRN’s report positions business analysis tools at the top of the big data tools pyramid, where they derive insight and value from the ever-growing volume of data. Hitachi Vantara focuses on the entire pyramid, since the insights and value are only as good as the data that goes into them.


While Pentaho is a core product in our analytics portfolio, we have other analytic tools like:

  • Hitachi Content Intelligence, part of our Hitachi Content portfolio, automates the extraction, classification, enrichment, and categorization of data residing in Hitachi Vantara and third-party repositories, on premises or in the cloud.
  • Hitachi Data Streaming Platform provides proactive streaming analytics to transform streaming IoT data into valuable business outcomes.
  • Hitachi Video Analytics can drive new business success through insights into customer behavior and preferences.
  • Hitachi Infrastructure Analytics Advisor uses machine learning to prescribe optimal IT infrastructure performance SLAs to improve user satisfaction, simplify budget forecasting with predictive analysis, and accelerate fault resolution by using AI to diagnose root causes, prescribe resolutions, and enable admins to automate fixes.


Hitachi Vantara also has the good fortune to be part of the larger global Hitachi corporation, which has operational expertise in many industries, from healthcare to energy to transportation systems. This expertise is critical in developing industry- or business-specific analytic models and automation tools that drive business outcomes.


CRN put together this list of 30 business analytics companies for the following purpose:


“…we've put together a list of 30 business analytics software companies that solution providers should be aware of, offering everything from simple-to-use reporting and visualization tools to highly sophisticated software for tackling the most complex data analysis problems.”


Hitachi Vantara is proud to be recognized as one of the 30 Coolest Business Analytics Vendors in CRN’s Big Data 100. We congratulate the other members of this list. Since big data and analytics require an ecosystem of vendors, I am sure that we will be working with many of them, as we already work with vendors like Microsoft and Salesforce. We will work with many more vendors and customers as we continue to develop the pyramid of big data tools required to address our customers’ business requirements.


"Cool" wasn't in anybody's vocabulary 118 years ago, but its essence was captured in Harmony, Sincerity, and Pioneering Spirit.