
How security researchers discover open Amazon S3 servers

11.12.2017

Recently, we have all been hit by a storm of data leak incidents, with a new one happening almost every month. A large share of these incidents is related to misconfigured Amazon S3 (AWS) cloud storage servers. Many corporations and governmental organizations store sensitive information in publicly accessible AWS S3 buckets. To name a few: Uber, Viacom, Accenture, and even the NSA and the Pentagon. We already wrote about some cloud data leaks in the "Corporate data rainstorms and drizzling from the Cloud Let's stop the bigger threat" blog post.

Unfortunately, IT administrators often do not properly control which access types they have defined for which buckets. Some of them seem to think that permissions are just too tricky, and that it is easier to allow anyone to access the storage than to set up and maintain a proper access control list (ACL). This is especially egregious when the only protection is a secret bucket name, on the assumption that no one else will be able to find and access it. :-) Experienced information security specialists will be familiar with this behavior: we call it security by obscurity.

I have to say that Amazon tries to prevent permission misconfiguration at almost every step. For example, it warns you when the Everyone user context is allowed to access the files being uploaded.

[Image: AWS configuration]

A rhetorical question: why is sensitive data (including NSA and Pentagon files with Secret and Top Secret classifications) stored on public cloud servers at all? And, if there is a reason for that, why isn't anybody running a data discovery procedure on the content of these files to prevent leaks? Solutions like DeviceLock Discovery can automatically scan files in the folders of cloud file sharing applications, locate documents with prohibited content, and eliminate discovered policy violations by enforcing various remediation actions, as well as initiating incident management procedures.

I do not want to dive deep into the technical details, so I will try to explain as simply as possible how security researchers discover open AWS buckets.

First, there is no official public list of AWS buckets. Of course, Amazon knows all the names, but that information is unavailable to others. Hence, to find a bucket you need to guess its name. The most productive way is a dictionary attack. The dictionary would include company names (e.g. "Uber"), some specific technical words (e.g. "production", "backup") and so on. A special script then takes words from this dictionary and builds bucket names by trying all possible word combinations. For example, for Uber it would find the uber.s3.amazonaws.com bucket (by the way, this bucket does not belong to Uber).
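The name-generation step can be sketched in a few lines of Python. This is only an illustration, not the script any particular researcher uses: the function names, the choice of separators, and the example keywords are my own assumptions. The length check reflects the S3 rule that bucket names are 3 to 63 characters long.

```python
def candidate_bucket_names(company, keywords, separators=("-", ".", "")):
    """Yield candidate bucket names built from a company name and keywords.

    The separators and the word order are illustrative assumptions; real
    dictionaries and combination rules vary from tool to tool.
    """
    company = company.lower()
    # The bare company name is itself a candidate (e.g. "uber").
    candidates = [company]
    for word in keywords:
        word = word.lower()
        for sep in separators:
            candidates.append(f"{company}{sep}{word}")
            candidates.append(f"{word}{sep}{company}")

    seen = set()
    for name in candidates:
        # S3 bucket names must be between 3 and 63 characters long.
        if 3 <= len(name) <= 63 and name not in seen:
            seen.add(name)
            yield name


def bucket_url(name):
    """Build the URL form used in this post: <name>.s3.amazonaws.com."""
    return f"https://{name}.s3.amazonaws.com/"


if __name__ == "__main__":
    for name in candidate_bucket_names("Uber", ["production", "backup"]):
        print(bucket_url(name))
```

Nothing here touches the network. The sketch only builds the list of names that a script would then try one by one.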

Second, many of the buckets do have proper permissions assigned, so you will not be able to access all of them. When you try to open such a bucket, you get an Access Denied error. Good scripts can check discovered buckets for public accessibility and even traverse nested directories and show all the files there as well.
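A rough sketch of how a script might interpret what it gets back is shown below. It assumes the usual S3 behavior for an anonymous request to a bucket URL: HTTP 200 with a ListBucketResult XML document when the bucket is publicly listable, 403 (Access Denied) when the bucket exists but is protected, and 404 when there is no such bucket. The function names and labels are hypothetical, and the HTTP request itself is left out. Note that S3 has no real directories: "nested folders" are simply key names that contain slashes.

```python
import xml.etree.ElementTree as ET

# XML namespace used by S3 bucket listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def classify_status(status_code):
    """Map the HTTP status of an anonymous bucket request to a label.

    The labels are made up for this sketch.
    """
    if status_code == 200:
        return "public"         # listing is readable by anyone
    if status_code == 403:
        return "access-denied"  # bucket exists, permissions are in place
    if status_code == 404:
        return "not-found"      # no bucket with this name
    return "unknown"            # redirects, throttling, and so on


def listed_keys(listing_xml):
    """Extract the object keys from a ListBucketResult XML document."""
    root = ET.fromstring(listing_xml)
    return [element.text for element in root.iter(f"{S3_NS}Key")]


if __name__ == "__main__":
    # A shortened, made-up listing in the shape S3 returns.
    sample = (
        '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
        "<Name>example-bucket</Name>"
        "<Contents><Key>backup/2017/db.sql</Key></Contents>"
        "<Contents><Key>readme.txt</Key></Contents>"
        "</ListBucketResult>"
    )
    print(classify_status(200), listed_keys(sample))
```

A single listing response is paginated, so a real tool would have to request further pages to see every file. That part is omitted here.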

[Images: Discovering AWS buckets; AWS buckets listing]

Third and most important, when security researchers do discover sensitive information, they usually notify the bucket owner about the data breach. However, the Uber case was a slightly different story. :-)

To wrap up, here are some useful links:

Author: Ashot Oganesyan, Co-Founder and CTO, DeviceLock Inc.