
The following is a summary of, and commentary on, a recent article in Antidote about the Big Data and psychometric technology behind Ted Cruz’s rise in conservative estimation before he ultimately lost to Trump, and the Trump campaign’s subsequent use of the same technology. The article is a translation of a piece that ran in the Swiss publication Das Magazin in December.

I want to highlight a few key elements from the Antidote translation, and some snippets from linked sources that appear in the article, so you may wish to read that first, or you can treat this as a TL;DR for it (the original is around 5,000 words, plus several thousand more in the linked material; this is a 10-minute read at just under 2,000 words).

What is Big Data?

As the name suggests, it is the storage and use of vast amounts of data. The difference between traditional data and Big Data is possibly best illustrated with a comparison. Consider someone buying a product in a shop: at the point of sale you capture the item (or items), the unit cost, the method of payment (and a bit more besides, if by card), and the date and time of the purchase – all very useful. Then came eCommerce. Consider the same transaction occurring on a website. You capture all of the above data, plus the individual shopper’s journey through the virtual store: what was and was not clicked on, time spent per page, etc. You may know the site that brought them to this shop (equivalent to knowing whether the shopper came to your store from the food court or a competitor). The payment method is also electronic (be it card, PayPal, or other), and thus comes with its own information (as above, but more so). If the visitor is a regular (and you use cookies), you will have both purchase history and previous site-visit behaviour. Finally, if the visitor has a cookie from any number of other sites, a broader viewing and purchasing picture can be derived. Traditional data is half a dozen data points per transaction; Big Data is dozens, hundreds, or even thousands of such data points and, importantly, the ability to do something with them.
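To make the comparison concrete, here is a minimal Python sketch of the two kinds of record. Every class name, field name, and value is hypothetical and chosen only for illustration; real tills and web shops record different things, and usually far more of them.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import Dict, List


@dataclass
class PointOfSaleRecord:
    """Roughly what a physical till captures for one transaction."""
    items: List[str]
    unit_costs: List[float]
    payment_method: str
    timestamp: datetime


@dataclass
class WebShopRecord(PointOfSaleRecord):
    """The same sale online: everything above, plus the shopper's trail."""
    referrer: str = ""
    pages_viewed: List[str] = field(default_factory=list)
    seconds_per_page: Dict[str, float] = field(default_factory=dict)
    clicked: List[str] = field(default_factory=list)
    ignored: List[str] = field(default_factory=list)
    purchase_history: List[str] = field(default_factory=list)
    third_party_cookie_sites: List[str] = field(default_factory=list)


def count_data_points(record) -> int:
    """Count individual values: one per scalar field, one per list/dict entry."""
    total = 0
    for f in fields(record):
        value = getattr(record, f.name)
        total += len(value) if isinstance(value, (list, dict)) else 1
    return total


# Made-up example transaction: a kettle and a mug.
when = datetime(2017, 1, 30, 14, 5)
pos = PointOfSaleRecord(["kettle", "mug"], [24.99, 6.50], "card", when)
web = WebShopRecord(
    ["kettle", "mug"], [24.99, 6.50], "paypal", when,
    referrer="search-engine",
    pages_viewed=["home", "kettles", "kettle-x", "mugs", "basket"],
    seconds_per_page={"home": 4.0, "kettles": 31.5, "kettle-x": 72.0,
                      "mugs": 12.0, "basket": 20.0},
    clicked=["kettle-x", "add-to-basket"],
    ignored=["toaster-banner"],
    purchase_history=["teapot"],
    third_party_cookie_sites=["news-site", "travel-site"],
)

print(count_data_points(pos))  # 6
print(count_data_points(web))  # 23
```

Even in this toy version, a single short visit yields roughly four times as many data points as the till receipt, and that is before anything is joined across visits or sites.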

What is Psychometrics?

Any attempt to measure a person’s personality is psychometrics. I’ve mentioned trait theory and the OCEAN model in other posts (100-item OCEAN inventories can be taken online), but equally Haidt’s Moral Foundations Questionnaire is psychometrics (as are all of the other measures available on Haidt and colleagues’ site). So, too, therefore, is the perennial Facebook (and OKCupid) favourite, Myers-Briggs.

With these five dimensions (O.C.E.A.N.), you can determine fairly precisely what kind of person you are dealing with—their needs and fears as well as how they are likely to behave. For a long time, however, the problem was data collection, because to produce such a character profile meant asking subjects to fill out a complicated survey asking quite personal questions. Then came the internet. And Facebook. And Kosinski.

Michal Kosinski

In 2008, Michal Kosinski, a Polish psychology student, was accepted to the University of Cambridge’s Psychometrics Centre. His doctoral work was to be with David Stillwell, using data from Stillwell’s MyPersonality app on Facebook. The app captured people’s responses to the Big 5 (OCEAN) inventory and other psychometric tests, along with the user’s Facebook profile (if the user consented).

It is notoriously hard to get people to engage in psychological studies. I struggled for eight weeks to get 121 participants for my study (two were eventually excluded due to age). Studies with hundreds of participants are common, but studies with thousands, much less so (Haidt’s MFQ work has many thousands, also by virtue of being on the internet). Stillwell’s app captured millions.

Stillwell and Kosinski (and others), with an embarrassment of data (both traditional psychometric measures and things like Facebook ‘Likes’), started to find correlations between traits (long supposed to be predictive of behaviour) and Facebook Likes (records of behaviour and behavioural preferences). For example, liking Wu-Tang Clan was a strong predictor of heterosexuality; liking Lady Gaga, of extroversion. With millions of data points and some serious computing power, predictive models were developed and refined, to the point where a mere 68 ‘Likes’ (on average) could predict the skin colour, sexual orientation, political preferences, religious affiliation, drug usage (both legal and illegal), and intellect of a given person (see figure 1).


Figure 1: Prediction accuracy for dichotomized attributes (from Kosinski, Stillwell & Graepel, 2013).[1]
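For a feel of how a handful of Likes can predict a yes/no attribute, here is a minimal sketch in plain Python. It is not Kosinski and colleagues’ code: their paper describes reducing a very large user-by-Like matrix before fitting regression models, whereas this toy skips that step and fits a small logistic regression directly. The six pages, the probabilities, and the "trait" itself are all synthetic assumptions made up for the example.

```python
import math
import random

# Hypothetical probability that a user WITH the trait likes each of six
# made-up pages. Users without the trait like each page with 1 - p, so
# pages 0-3 carry signal and pages 4-5 are pure noise.
LIKE_PROB_IF_TRAIT = [0.8, 0.8, 0.2, 0.2, 0.5, 0.5]


def make_users(n, rng):
    """Generate n synthetic (likes, trait) pairs; likes is a 0/1 list."""
    users = []
    for _ in range(n):
        trait = 1 if rng.random() < 0.5 else 0
        likes = [
            1 if rng.random() < (p if trait else 1 - p) else 0
            for p in LIKE_PROB_IF_TRAIT
        ]
        users.append((likes, trait))
    return users


def predict_proba(weights, bias, likes):
    """Logistic model: estimated probability that the user has the trait."""
    z = bias + sum(w * x for w, x in zip(weights, likes))
    return 1.0 / (1.0 + math.exp(-z))


def train(users, epochs=300, lr=0.5):
    """Fit weights by plain full-batch gradient descent on the log loss."""
    n_features = len(users[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for likes, trait in users:
            err = predict_proba(weights, bias, likes) - trait
            for j, x in enumerate(likes):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / len(users) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(users)
    return weights, bias


def accuracy(weights, bias, users):
    """Share of users whose trait is guessed correctly at a 0.5 cut-off."""
    hits = sum(
        (predict_proba(weights, bias, likes) >= 0.5) == bool(trait)
        for likes, trait in users
    )
    return hits / len(users)


rng = random.Random(0)            # fixed seed so the run is repeatable
train_users = make_users(400, rng)
test_users = make_users(200, rng)  # held out: never seen during training
weights, bias = train(train_users)
acc = accuracy(weights, bias, test_users)
print("held-out accuracy:", round(acc, 2))
print("weights:", [round(w, 2) for w in weights])
```

The learned weights come out positive for the pages that trait-holders favour and negative for the ones they avoid, which is the same kind of association as the Wu-Tang Clan and Lady Gaga examples above, just on invented data.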

The strength of the model depended on how well it could predict a test subject’s answers. Kosinski kept working at it. Soon, with a mere ten “likes” as input his model could appraise a person’s character better than an average coworker. With seventy, it could “know” a subject better than a friend; with 150 likes, better than their parents. With 300 likes, Kosinski’s model could predict a subject’s answers better than their partner. With even more likes it could exceed what a person thinks they know about themselves [Youyou, Kosinski & Stillwell, 2015[2]].

The day he published these findings, Kosinski received two phone calls. One was a threat to sue, the other a job offer. Both were from Facebook.

Facebook stopped ‘Likes’ from being default viewable by everyone, but as apps are obliged to get your permission to view this information (and Kosinski had been doing so, even when the data was readily viewable), this was a fairly pointless exercise.

…using all this data, psychological profiles can not only be constructed, but they can also be sought and found. For example if you’re looking for worried fathers, or angry introverts, or undecided Democrats. What Kosinski had invented was essentially a search engine for people. He has been getting more and more acutely aware of both the potential and the danger his work presents.

It is somewhat surprising that Google weren’t interested. Then again, given the controversy over ‘Search Bubbles’ (aka Google Personalized Search; see also Eli Pariser, at TED, on Filter Bubbles, in 2011), and their oft-proclaimed ethos of “Don’t be evil,” maybe discretion was the better part.

Cambridge Analytica

…in November 2015, the more radical of the two Brexit campaigns (supported by Nigel Farage) announced that they had contracted with a Big Data firm for online marketing support: Cambridge Analytica. The core expertise of this company: innovative political marketing, so-called microtargeting, by measuring people’s personality from their digital footprints based on the Ocean model.

Then, in June 2016, it was announced that Trump had also hired Cambridge Analytica, the engineers of Ted Cruz’s meteoric rise – from 40% recognition, and 5% support, to widespread recognition and 35% support. Head of Cambridge Analytica, Alexander Nix, wasn’t shy about publicising their hand in Cruz’s rise at the Concordia Summit, but was a little more circumspect about making it known that they were now working with Donald Trump.


Video: Alexander Nix detailing Cambridge Analytica’s involvement in the Cruz Primary campaign (the video is 11 minutes well spent if you want to understand what’s going on).

“We have profiled the personality of every adult in the United States of America—220 million people,” Nix boasted in an interview with Das Magazin. And all the evidence suggests that they deployed this powerful data set politically.


On the one hand, Trump is self-contradictory, denying his own words and seeming to hold multiple positions on the same topic. On the other hand, an electorate can’t keep track of all of the soundbites, and so loses any sense of continuity and, with it, the integrity of the message (let alone of the person). This combination of messenger and message leaves people vulnerable to Cambridge Analytica’s approach. Targeting each voter with the specific Trump message that suits their personality profile blocks out the contradictory messages from the same messenger, contradictions that in any normal election would be suicide (see Mitt “Flip-Flop” Romney).

This is effectively parasitizing confirmation bias, and it is a threat to free will, in the fine tradition of entrapment, gaslighting, and duress.

Mathematician Cathy O’Neil, apparently unaware of Cambridge Analytica’s influence at the time, had this to say about Trump’s apparent flip-flopping:

…he’s got biased training data, because the people at his rallies are a particular type of weirdo. That’s one reason he consistently ends up saying things that totally fly within his training set – people at rallies – but rub the rest of the world the wrong way.

Next, because he doesn’t have any actual beliefs, his policy ideas are by construction vague. When he’s forced to say more, he makes them benefit himself, naturally, because he’s also selfish. He’s also entirely willing to switch sides on an issue if the crowd at his rallies seem to enjoy that.

In that sense he’s perfectly objective, as in morally neutral. He just follows the numbers. He could be replaced by a robot that acts on a machine learning algorithm with a bad definition of success – or in his case, a penalty for boringness – and with extremely biased data.

Compare that with what Cambridge Analytica were actually doing with Trump:

…starting in July 2016, a new app was prepared for Trump campaign canvassers with which they could find out the political orientation and personality profile of a particular house’s residents in advance. If the Trump people ring a doorbell, it’s only the doorbell of someone the app has identified as receptive to his messages, and the canvassers can base their line of attack on personality-specific conversation guides also provided by the app. Then they enter a subject’s reactions to certain messaging back into the app, from where this new data flows back to the dashboards of the Trump campaign.

Cambridge Analytica split the US population into 32 distinct personality profiles – compare this with Myers-Briggs, which has 16. Several commenters on this blog have pointed out Clinton’s failure to go to Michigan and Wisconsin, so it’s interesting to note that Cambridge Analytica explicitly targeted those states and, more specifically, targeted men who love American-made cars. I don’t know that Clinton could have made a plausible appeal to this population.

The Future

Steve Bannon is now on the board of Cambridge Analytica, and they are in talks with people in Switzerland and Germany. Marine Le Pen’s niece, herself a right-wing activist, is interested in collaboration. So if the recent groundswell of right-wing populism around Europe and the world isn’t already due to such activity, it soon will be.

In a soon-to-be-published paper, Kosinski found that:

…marketers can attract up to 63% more clicks and up to 1400% more conversions in real-life advertising campaigns on Facebook when matching products and marketing messages to consumers’ personality characteristics. They further demonstrate the scalability of personality targeting by showing that the majority of Facebook Pages promoting products or brands are affected by personality and that large numbers of consumers can be accurately targeted based on a single Facebook Page.
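It is easy to misread “1400% more” as “14 times”; it actually means 15 times the baseline. A quick worked example, using a made-up baseline of 100 clicks and 10 conversions (the paper’s underlying counts are not given here, so these figures are purely illustrative):

```python
def with_uplift(baseline, percent_more):
    """Return the baseline increased by `percent_more` percent."""
    return baseline * (100 + percent_more) / 100


# Hypothetical baseline figures, chosen only to make the arithmetic visible.
baseline_clicks = 100
baseline_conversions = 10

matched_clicks = with_uplift(baseline_clicks, 63)              # 63% more
matched_conversions = with_uplift(baseline_conversions, 1400)  # 1400% more

print(matched_clicks)       # 163.0
print(matched_conversions)  # 150.0
```

Note, too, that both figures are quoted as “up to”, so they are best-case results rather than averages.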

As I have already said, this is an abuse of free will: it appeals to the intuitive rather than the rational. Recall that the rational is supposed to monitor the intuitive to stop it from doing anything stupid, but the rational brain cannot deal with information overload.


I can’t end this piece much more succinctly than the original article:

The world has been turned upside down. The Brits are leaving the EU; Trump rules America. And in Stanford the Polish researcher Michal Kosinski, who indeed tried to warn of the danger of using psychological targeting in a political setting, is still getting accusatory emails. “No,” says Kosinski quietly, shaking his head, “this is not my fault. I did not build the bomb. I just showed that it was there.”

There is no small parallel to Oppenheimer’s utterance:

YouTube video


The Concordia summit, from which the video of Alexander Nix comes, has some things on its website that cause me to question whether Nix should have been there:

In the summer of 2011, friends and business partners Nick Logothetis and Matthew Swift had an idea for convening thought leaders in the midst of the 10th anniversary of 9/11 to discuss the importance of partnerships in <b>combatting extremism</b>. Since then, Concordia has grown towards a belief that P3s (Private-Public Partnerships) are a fundamental tool to addressing many societal challenges. Concordia recognizes that cross-sector collaboration offers effective solutions. [emphasis mine]

[1] Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802-5805.

[2] Youyou, W., Kosinski, M., & Stillwell, D. (2015). Computer-based personality judgments are more accurate than those made by humans. Proceedings of the National Academy of Sciences, 112(4), 1036-1040.
