Over the past year there has been a surge in reports of data breaches from Amazon S3 buckets that were left accessible online, exposing private information from all sorts of companies and their customers. The list of high profile victims includes, Accenture, Booz Allen Hamilton, Dow Jones, Time Warner, Tesla, TSA, Verizon, and the U.S. National Geospatial Intelligence Agency, and resulted in the exposing of hundreds of millions of private records to public access.
How did this come about? When you create an Amazon S3 (Simple Storage Service) cloud storage bucket, you can store and retrieve data from anywhere on the web. In most of the breaches, the companies left Amazon S3 buckets configured to allow public access. This means that anyone on the internet with the name of the S3 bucket could access and download its content.
According to statistics by security firm Skyhigh Networks, 7% of all S3 buckets have unrestricted public access, and 35% are unencrypted, meaning this is a major problem for the entire Amazon S3 ecosystem. To compound the problem there are many tools available on the web to comb through leaky AWS datasets, so finding these exposed S3 buckets is relatively easy. BuckHacker is a search engine tool that provides the most accessible way to search for exposed S3 buckets.
While encryption is one way to limit the consequences of an S3 data breach many people might think they are protected by encryption when they actually aren’t. People believe that if the data is encrypted, then the data content cannot be rapidly compromised. It depends on where the encryption is done. The most easily-adopted approach to encryption, Server Side Encryption (SSE) doesn’t solve the open bucket problem. With server-side encryption (SSE) Amazon S3 will encrypt your data before saving it to disks in its datacenter, but the data is automatically decrypted when the data is downloaded – so data in an encrypted open bucket is accessible in clear text despite being stored in an encrypted manner.
Protection is provided by client-side encryption, where a client such as Hitachi Content Platform (HCP) encrypts the data and then uploads the encrypted data to Amazon S3. In this case the client manages the encryption process, the encryption keys, and related tools, and maintains ownership of the encryption keys. And even if the bucket is publicly readable, the content is still encrypted when it is accessed.
Another solution for S3 data leaks is to take an on-premises or hybrid approach with HCP as a client to S3. HCP is designed to run in the datacenter or as a hybrid cloud where layers of existing enterprise security processes and protocols already keep hackers out. HCP avoids risk thanks to the enterprises’ custom security that is well understood and trusted, whereas systems that are run as a public cloud have a much broader attack surface with a one-size-fits-all security approach. Running the system on-premises minimizes the risk of accidental public exposure of confidential data. In this way you can take advantage of the scalability and low cost of Amazon S3 storage while enjoying the protection provided by your enterprise firewall, and access controls, along with HCP’s data protection services like data at rest and data in flight encryption. See my previous post on HCP security features. All these HCP security features remain in force when tiered to S3 buckets.
An on-premises or hybrid approach with HCP as a client to S3 can provide another layer of protection if hackers should be able to penetrate AWS and gain access to the underlying computing infrastructure, which may enable direct access to the physical media that Amazon uses for S3 or any backup or replicas. Since the data would have been encrypted by HCP, the hackers would be unable to access the data. Also, since Amazon is a public cloud service, there is always the possibility of a National Security Letter or other legal hold being placed on the data, which would require users to retain their files in another location that could also be attacked.
On premises or hybrid HCP running as a client to Amazon S3 services is the best way to plug leaky Amazon S3 buckets.