Researchers at UpGuard, a cybersecurity firm, found troves of user information hiding in plain sight, inadvertently posted publicly on Amazon.com Inc.’s cloud computing servers. The discovery shows that a year after the Cambridge Analytica scandal exposed how insecure and widely disseminated Facebook users’ information is online, companies that control that information at every step still haven’t done enough to seal up private data.
In one instance, Mexico City-based digital platform Cultura Colectiva openly stored 540 million records on Facebook users, including identification numbers, comments, reactions and account names. The records were accessible and downloadable for anyone who could find them online. That database was closed on Wednesday after Bloomberg alerted Facebook to the problem and Facebook contacted Amazon. Facebook shares pared their gains after the Bloomberg News report.
Another database for a long-defunct app called At the Pool listed names, passwords and email addresses for 22,000 people. UpGuard doesn’t know how long the records were exposed, as the database became inaccessible while the company was looking into it.
Facebook shared this kind of information freely with third-party developers for years, before cracking down more recently. The problem of accidental public storage could be more extensive than those two instances. UpGuard found 100,000 open Amazon-hosted databases for various types of data, some of which it expects aren’t supposed to be public.
“The public doesn’t realize yet that these high-level systems administrators and developers, the people that are custodians of this data, they are being either risky or lazy or cutting corners,” said Chris Vickery, director of cyber risk research at UpGuard. “Not enough care is being put into the security side of big data.”
Cultura Colectiva is a digital platform that posts stories about celebrities and culture and largely targets a Latin American audience. The company’s website says it creates content through data and technology and has more than 45 million followers on Facebook, Instagram, Twitter, YouTube and Pinterest.
Facebook for many years allowed anyone making an app on its site to obtain information on the people using the app, and those users’ friends. Once the data is out of Facebook’s hands, the developers can do whatever they want with it.
About a year ago, Facebook Chief Executive Officer Mark Zuckerberg was preparing to testify to Congress about a particularly egregious example: A developer who handed over data on tens of millions of people to Cambridge Analytica, the political consulting firm that helped Donald Trump on his presidential campaign. That one instance has led to government probes around the world, and threats of further regulation for the company.
Last year, Facebook started an audit of thousands of apps and suspended hundreds until it could make sure they weren’t mishandling user data. Facebook now offers rewards for researchers who find problems with its third-party apps.
A Facebook spokesperson said that the company’s policies prohibit storing Facebook information in a public database. Once it was alerted to the issue, Facebook worked with Amazon to take down the databases, the spokesperson said, adding that Facebook is committed to working with the developers on its platform to protect people’s data.
In the Cultura Colectiva dataset, which totaled 146 gigabytes, it was difficult for researchers to know how many unique Facebook users were affected. UpGuard also had trouble working to get the database closed. The firm sent emails to Cultura Colectiva and Amazon over many months to alert them to the problem. It wasn’t until Facebook contacted Amazon that the leak was addressed. Cultura Colectiva didn’t respond to Bloomberg’s request for comment.
This latest example shows how the data security issues can be amplified by another trend: the transition many companies have made from running operations predominantly in their own data centers to cloud-computing services operated by Amazon, Microsoft Corp., Alphabet Inc.’s Google, and others.
Those tech giants have built multibillion-dollar businesses by making it easy for companies to run applications and store troves of data, from corporate documents to employee information, on remote servers.
Programs like Amazon Web Services’ Simple Storage Service, essentially an internet-accessed hard drive, offer clients the choice of whether to make the data visible to just the person who uploaded it, other members of their company, or anyone online. Sometimes, that information is designed to be public-facing, as in the case of a cache of photos or other images stored for use on a corporate website.
Other times, it isn’t. In recent years, information stored on several cloud services -- U.S. military data, personal information of newspaper subscribers and cell phone users -- has been inadvertently shared publicly online and discovered by security researchers.
Amazon in the last two years has beefed up protocols to keep customers from exposing sensitive materials, adding prominent warning notices, making tools for administrators to more simply turn off all public-facing items, and offering for free what was formerly a paid add-on to check a customer’s account for exposed data.
“Originally I would have put a lot of this on AWS,” said Corey Quinn, who advises businesses that use Amazon’s cloud at the Duckbill Group, a consulting firm. But since Amazon has taken steps to address the issue, companies like Cultura should be aware, he said. “With all of this in the news, and all of this continuing to come out, if you’re still opening AWS buckets [to the public], you’re not paying attention.”
Amazon isn’t the only company that periodically gets caught up in cases of private records mistakenly made public. But it has a wide lead in the business of selling rented data storage and computing power, putting a spotlight on the Seattle-based company’s practices. An Amazon Web Services spokesman declined to comment.