Make privacy great again

Facebook’s Cambridge Analytica scandal, explained [Updated]

Trump operatives got private data from 50 million Facebook users.

Update: Cambridge Analytica has suspended CEO Alexander Nix. In addition to controversy over unauthorized access to private Facebook data, Nix is also facing a scandal over comments captured by hidden cameras. In those videos, Nix boasts about using dirty tricks—including staged bribery attempts and sending prostitutes to seduce political opponents—to win elections.


Facebook is reeling from a series of revelations about private user data being leaked to Cambridge Analytica, a shadowy political consulting firm that did work for the Donald Trump campaign.

Last Friday, reporters from The New York Times and The Observer of London told Facebook that Cambridge had retained copies of private data for about 50 million Facebook users. Facebook says Cambridge promised in 2015 that the data would be deleted. Facebook responded to the new revelations by banning Cambridge and several of its associates from Facebook.

But this week the controversy surrounding Facebook's ties to Cambridge—and its handling of private user data more generally—has mushroomed. British members of Parliament accused Facebook of misleading them about the breach and asked CEO Mark Zuckerberg to come to the UK to clear up the issue personally. Facebook has scheduled a surprise all-hands meeting to answer employee questions about the controversy.

The scandal has attracted broad public interest because Cambridge did millions of dollars in political consulting work for Donald Trump's presidential campaign. Some reports have portrayed the firm as the mastermind behind Trump's election victory, and the stolen Facebook data as a key part of Trump's digital strategy. But while the firm's controversial psychographic techniques have attracted a lot of attention, there's reason to doubt that they were actually used in the 2016 election.

The larger concern for Facebook is that the Cambridge leak could be seen as just one example of a broader pattern of lax handling of confidential user data. Facebook offers users privacy controls that are supposed to limit who has access to their data—and Facebook has promised the Federal Trade Commission that it will ensure those settings are honored.

But recent reports indicate that Facebook's privacy measures haven't been effective. That could damage users' faith in Facebook's privacy promises. And it is already attracting scrutiny from government regulators, both in the United States and Europe, who want to know why Facebook didn't do a better job of protecting customers' private information.

Cambridge probably didn’t use illicit Facebook data to help Trump

Before the 2016 election, Cambridge Analytica was an obscure consulting company funded by the family of conservative hedge fund mogul—and Republican political donor—Robert Mercer. But in the weeks after the 2016 election, rumors began to circulate that Cambridge had played a key role in Donald Trump's victory.

Where conventional political advertising uses crude demographic factors like age and ZIP code to target advertising, Cambridge supposedly used a technique called psychographics, which involves building a detailed psychological profile of a user that will allow a campaign to predict exactly what kind of appeal will be most likely to convince any particular voter.

A widely read January 2017 article in Motherboard points to a September 2016 talk by Cambridge Analytica CEO Alexander Nix in which he described the technique:

Nix shows how psychographically categorized voters can be differently addressed, based on the example of gun rights, the 2nd Amendment: "For a highly neurotic and conscientious audience the threat of a burglary—and the insurance policy of a gun." An image on the left shows the hand of an intruder smashing a window. The right side shows a man and a child standing in a field at sunset, both holding guns, clearly shooting ducks: "Conversely, for a closed and agreeable audience. People who care about tradition, and habits, and family."

Motherboard—and other reports circulating around the same time—portrayed Facebook-derived data as the key to this targeting effort. They pointed to work by researcher Michal Kosinski that found he could predict a lot about a person based on Facebook likes. Kosinski created an online personality quiz that required users to log in to Facebook to take it. Once users logged in, he collected data from the user's Facebook profile, including the list of pages they have "liked." The quiz was a hit, and Kosinski soon had a large database of people's private Facebook data. And he found that Facebook data was a surprisingly good predictor of other demographic and personality traits.

"On the basis of an average of 68 Facebook 'likes' by a user, it was possible to predict their skin color (with 95-percent accuracy), their sexual orientation (88-percent accuracy), and their affiliation to the Democratic or Republican party (85 percent)," Motherboard reported.

Aleksandr Kogan, a psychology professor who was working with Cambridge Analytica, asked to buy Kosinski's data. Kosinski declined, and the Times reports that Cambridge then paid Kogan more than $800,000 to create his own personality app to harvest data from Facebook users.

Kogan attracted 270,000 Facebook users to take the online personality quiz. But Facebook's APIs at the time allowed Kogan's app to also collect a broad range of information about each authorized user's friends. The average Facebook user has hundreds of friends, so Kogan was able to leverage his user base of 270,000 people to harvest data for about 50 million Facebook users.
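The arithmetic behind that figure can be checked with a quick sketch. The two totals come from the reporting above; the per-user average is derived from them rather than reported, and it counts distinct profiles only, since a friend shared by several quiz takers is harvested once. The function name `estimated_reach` is purely illustrative.

```python
# Figures reported in the article.
quiz_takers = 270_000
profiles_harvested = 50_000_000

# Derived, not reported: average number of distinct profiles
# each quiz taker exposed (friends shared with other takers count once).
profiles_per_taker = profiles_harvested / quiz_takers


def estimated_reach(takers: int, avg_distinct_profiles: float) -> int:
    """Rough reach estimate: takers times distinct profiles exposed per taker."""
    return round(takers * avg_distinct_profiles)


print(f"{profiles_per_taker:.0f} distinct profiles per quiz taker")    # 185
print(f"{estimated_reach(quiz_takers, 185):,} profiles at 185 each")  # 49,950,000
```

An average of roughly 185 distinct profiles per quiz taker is consistent with users having hundreds of friends each, once overlapping friendships are discounted.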

Facebook says Kogan told the company he was collecting the data only for academic purposes. But that wasn't true. Kogan shared this data with Cambridge Analytica for use in its ad-targeting work.

Cambridge then won the Ted Cruz presidential campaign as a client. In December 2015, The Guardian revealed that Cambridge was using data harvested from Facebook in its work on the Cruz campaign. According to the Times, Facebook quietly verified the leak and took steps to secure the data, obtaining a promise from Cambridge that the data had been deleted. More recent reporting suggests the data was not in fact deleted.

While Cambridge got a lot of press coverage—both before the 2016 election and after it—post-election reporting has cast doubt on the effectiveness of Cambridge Analytica's methods. The New York Times reported last year that "Cambridge's psychographic models proved unreliable in the Cruz presidential campaign, according to Rick Tyler, a former Cruz aide, and another consultant involved in the campaign. In one early test, more than half the Oklahoma voters whom Cambridge had identified as Cruz supporters actually favored other candidates. The campaign stopped using Cambridge's data entirely after the South Carolina primary."

"After the Cruz campaign flamed out, Mr. Nix persuaded Mr. Trump's digital director, Brad Parscale, to try out the firm," the Times added. "Its data products were considered for Mr. Trump's critical get-out-the-vote operation. But tests showed Cambridge's data and models were slightly less effective than the existing Republican National Committee system, according to three former Trump campaign aides."

But the biggest problem for the theory that stolen Facebook data was the key to Trump's election is this: according to a March 2017 Times story, "Cambridge executives now concede that the company never used psychographics in the Trump campaign." Other reporting around the same time reached the same conclusion.

Indeed, this becomes clear if you read that 2017 Motherboard article carefully. As an example of Trump's ad targeting techniques, Motherboard reported that "in the Miami district of Little Haiti, Trump's campaign provided inhabitants with news about the failure of the Clinton Foundation following the earthquake in Haiti, in order to keep them from voting for Hillary Clinton."

You can debate whether this amounts to a political dirty trick or legitimate campaign criticism. But it's definitely not an example of cutting-edge psychographic profiling. Facebook offers every advertiser the ability to target ads based on conventional demographic criteria like race and ZIP code. This kind of message targeting didn't require using purloined Facebook user data to build psychographic profiles of voters.

The Cambridge breach could be the tip of the iceberg

So the stolen Facebook data doesn't seem to have played a significant role in getting Donald Trump elected president. Nevertheless, the controversy surrounding Cambridge has highlighted the fact that, at least until recently, Facebook has not done a good job of safeguarding private user data.

On Tuesday, The Guardian published an interview with Sandy Parakilas, a "platform operations manager at Facebook responsible for policing data breaches by third-party software developers between 2011 and 2012." According to Parakilas, Facebook required app developers to sign agreements promising to abide by privacy restrictions attached to user data they received through Facebook APIs, but enforcement of these requirements was extremely lax.

"My concerns were that all of the data that left Facebook servers to developers could not be monitored by Facebook, so we had no idea what developers were doing with the data," Parakilas told The Guardian. "Once the data left Facebook servers, there was not any control, and there was no insight into what was going on."

Parakilas said that "it has been painful watching" the unfolding scandal around Cambridge Analytica, because "I know that they could have prevented it."

"In the time I was there, I didn’t see them conduct a single audit of a developer’s systems," he said. Parakilas estimates that tens, possibly hundreds, of apps took advantage of this opportunity to harvest the data of their users' unwitting friends.

One of the most controversial features of Facebook's APIs for third-party apps was known as "Friends Permission." This feature gave developers—including Aleksandr Kogan, the professor whose personality quiz app harvested data for Cambridge—the ability to not only gather information about their own users but also to get data about their friends. Facebook's lax approach to privacy in its early years aided its rapid growth, as it enabled the creation of viral hits like the Farmville game app.
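A toy model makes the effect of such a permission concrete. The sketch below is not Facebook's API: the friend graph, the user names, and the function `visible_profiles` are all invented for illustration. It only shows how letting an app read the friends of its own users widens the set of profiles it can see.

```python
from typing import Dict, Set

# Hypothetical friend graph; every name here is made up.
FRIENDS: Dict[str, Set[str]] = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice"},
    "dave": {"bob"},
}


def visible_profiles(app_users: Set[str], friends_permission: bool) -> Set[str]:
    """Return the profiles a third-party app could read in this toy model."""
    visible = set(app_users)
    if friends_permission:
        # Each authorizing user also exposes their direct friends,
        # none of whom installed the app or were asked for consent.
        for user in app_users:
            visible |= FRIENDS.get(user, set())
    return visible


print(sorted(visible_profiles({"alice"}, friends_permission=False)))  # ['alice']
print(sorted(visible_profiles({"alice"}, friends_permission=True)))   # ['alice', 'bob', 'carol']
```

In this model a single authorizing user exposes three profiles instead of one. Note that only direct friends are reached; friends of friends, such as "dave" here, stay out of view unless one of their own friends also uses the app.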

Facebook put an end to the Friends Permission feature in 2014, according to The Guardian.

The mere fact that Facebook allowed so much nominally private data to leak to third parties would be embarrassing enough. The larger concern for Facebook is that the company signed a deal with the Federal Trade Commission in 2011 that was specifically focused on enforcing user privacy settings. Two former FTC officials told The Washington Post this week that allowing user data to be disclosed to third parties may have violated the terms of that 2011 agreement, which could potentially expose Facebook to large fines.

And Facebook is about to get a lot of unwelcome scrutiny from regulators in Europe, which generally has stricter privacy laws than the United States. British privacy regulators have raided the offices of Cambridge Analytica to determine whether the firm still has illicit Facebook user data. And European regulators say they are "following up" with Facebook to make sure the social media giant is complying with applicable EU regulations.
