Written by Paul Eccleson on Monday 14 August, 2023
Researchers at the website Data Colada are on a mission to clean up psychological science. In a recent series of posts,[1] they alleged that four studies involving Harvard Business School professor Francesca Gino used falsified data to produce positive findings. In an ironic twist, some of the experiments were on dishonesty and ethics.
The accusations have led some to question whether the field of behavioural science can be trusted. Given that governments and regulators across the globe are using the science as the basis of policy and rule-setting, such accusations have serious implications for healthcare, financial markets, law enforcement and the judiciary.
It is worth taking a closer look at Data Colada’s claims and the weaknesses in behavioural science generally. We need to ask whether we should be concerned about the application of psychology to risk and compliance.
Data Colada’s claims
In August 2021 the Data Colada team disclosed that they believed data in an influential behavioural science paper was falsified. The landmark 2012 study was undertaken by Lisa L. Shu, Nina Mazar, Francesca Gino, Dan Ariely and Max H. Bazerman.
There were three studies reported in the paper. The third used data from a car insurance company to conclude that signing an honesty declaration at the top of a claim form makes individuals less likely to lie about the mileage they drove. However, the base mileage data used to judge dishonesty was suspicious. One would expect the mileage driven across a large population to vary widely: a few people will drive very little, a few will drive vast distances, and most will drive something close to the average.
But what the Data Colada team discovered was that the base data showed a uniform pattern of mileage with a sharp cut-off at 50,000 miles. In the 2012 study, roughly equal numbers of people appeared to have driven every distance in that range – and nobody drove beyond it. This simply didn’t make sense. The reviewers identified other indicators of falsification too, such as suspiciously precise reported mileages (e.g., 1,743 miles rather than 1,750 – people tend to round mileage when they self-report) and changing fonts, indicating data had been copied poorly.
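To make the statistical red flag concrete, here is a minimal sketch of the kind of distribution check involved. It is illustrative only – not Data Colada’s actual analysis – and the sample size and the lognormal shape assumed for ‘plausible’ mileages are assumptions chosen for the example.

```python
# Illustrative comparison: a plausible right-skewed mileage sample vs. the
# roughly uniform, hard-capped pattern reported in the 2012 data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N = 10_000  # assumed sample size, for illustration only

# Plausible self-reported mileages: right-skewed with a long tail
# (lognormal parameters are assumptions, centring the median near 10,000).
plausible = rng.lognormal(mean=9.2, sigma=0.6, size=N)

# The suspicious pattern: roughly equal counts at every distance,
# with a hard cut-off at 50,000 miles and nothing beyond it.
suspicious = rng.uniform(0, 50_000, size=N)

for label, sample in [("plausible", plausible), ("suspicious", suspicious)]:
    # Kolmogorov-Smirnov distance from a uniform distribution on [0, 50,000]
    stat, p = stats.kstest(sample, "uniform", args=(0, 50_000))
    print(f"{label}: KS distance = {stat:.3f} (p = {p:.2g}), "
          f"max = {sample.max():,.0f} miles")
```

Real-world mileage data should sit far from uniform; a dataset that fits a uniform distribution almost perfectly, and stops dead at a round number, is itself a red flag.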
The latest Data Colada analysis identifies similar red flags in the data used in papers co-authored by Francesca Gino. They claim the following:
- Revisiting the Shu et al. paper, Study 1 showed signs that data indicating a neutral result had been deliberately manipulated to conclude that signing at the top of the form encouraged honesty. The Data Colada team used a little-known feature of Excel to show that data had been moved manually to drive the positive finding (see the sketch after this list). The manipulation had, this time, been undertaken by a different researcher. In their words: ‘Two different people independently faked data for two different studies in a paper about dishonesty’.
- In another study, Gino et al. (2015) hypothesised that students forced to argue against their own beliefs would feel a little ‘grubby’ and, as a consequence, would have more favourable opinions of cleaning products (in order to reduce that uncomfortable feeling). Once again, Data Colada found suspicious data. The subjects of the experiment were asked which year of their studies they were in. Some 20 respondents gave the nonsensical answer ‘Harvard’. In its own right, this is a red flag for data manipulation. However, these 20 responses wholly accounted for the positive result of the experiment, suggesting the data had been fabricated.
- Gino and Wiltermuth (2014) shows similar manipulation. In this paper, 13 data entries appear to have been moved so that a negative result became a positive one, supporting the hypothesis that cheating was correlated with creativity.
- Gino et al. (2020) also shows evidence of tampering. Despite giving scores indicating that they found networking events stressful and distressing, the same subjects’ verbal comments, recorded on the same forms, seemed universally positive. The Data Colada team concluded that an experimenter had changed the scores but not the words. Only the scores were used in the experiment; the words were ignored. Once again, the scores with mismatched words accounted for the positive result; without them there would be no result at all.
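On the Excel point above: the ‘little-known feature’ reportedly refers to calcChain.xml, a metadata file embedded in every .xlsx workbook that records the order in which Excel calculates formula cells. The sketch below shows how such a file can be inspected; it is a simplified illustration, not Data Colada’s code, and the example file name is hypothetical.

```python
# A minimal sketch of inspecting an .xlsx calculation chain. An .xlsx file
# is a zip archive; xl/calcChain.xml lists formula cells in the order Excel
# first calculated them, an order that can survive later cut-and-paste edits.
import zipfile
import xml.etree.ElementTree as ET

def calc_chain_cells(xlsx_path: str) -> list[str]:
    """Return formula-cell references in calculation-chain order."""
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/calcChain.xml"))
    # Each child element <c> carries its cell reference in the 'r' attribute.
    return [c.attrib["r"] for c in root if "r" in c.attrib]

# Cells appearing out of row order relative to their neighbours suggest that
# rows were moved after their formulas were first entered.
# Usage (hypothetical file name):
#   print(calc_chain_cells("study1_data.xlsx")[:10])
```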
Data Colada is at pains to make clear that its accusations are aimed solely at Francesca Gino and Dan Ariely, not at their co-researchers. Francesca Gino is on leave from Harvard University, and all the articles mentioned above have either been retracted or have retraction requests associated with them.
Psychology
Psychology as a discipline has faced a number of methodological challenges throughout its history. Sir Cyril Burt, for instance, was an eminent psychologist whose experiments were used to promote the theory that intelligence is heritable. After his death it was claimed that he had not only fabricated data on identical twins who never existed but also invented two co-researchers he claimed had worked with him. His data and methods were eventually declared untrustworthy, but by then Burt’s work had been used as justification for the UK’s ‘11-plus’ examination – a selection test whose results were used to channel children into academic or technical schooling around the age of 11.[2]
Psychological science has also been criticised for other failings, which can be grouped under the following broad headings.
- Cultural bias. Psychology is primarily based on experiments involving White, European subjects, with little regard for other cultural contexts.
- The ‘replication crisis’. The findings of some influential studies have failed to replicate when the experiments were repeated.
- Publication bias. Researchers are under pressure to publish often, and only studies with positive results tend to see the light of day.
- i-frame not s-frame. It has been suggested that ‘nudges’ often produce only small effects, and that the science’s focus on changing individual behaviour (the i-frame) is less effective than policies aimed at changing societal rules and systems (the s-frame).
- Oversimplification. Concepts such as ‘system 1’ (instinctive, subconscious thinking) vs. ‘system 2’ (deliberate, conscious reasoning) are an overly simplified characterisation of the full complexity of human cognition.
There can be no doubt that psychology has had its instances of fraud, misrepresentation and overgeneralisation. Yet the same is true of other, more ‘concrete’ sciences. When genetic scientists at University College London were found to have fabricated results over many years, no one demanded a rejection of the entire field of genetics.[3]
Any science is an inherently social process, subject to the same human biases, flaws and predispositions as any other human activity. This is a reason to be cautious and thorough when applying scientific findings, but it is not a reason to reject the science outright. Careful consideration, and much experimentation, is required when deploying interventions. There are many successful examples of behavioural science being applied to significant human problems. That the science might sometimes be problematic should not lead us to reject the positive solutions it can offer.
[1] See http://datacolada.org/ for details of the studies and more on the claims.
[2] The Wikipedia entry for Cyril Burt is a good starting point for understanding the controversy: https://en.wikipedia.org/wiki/Cyril_Burt – accessed July 2023
[3] Ian Sample, ‘Top geneticist “should resign” over his team’s laboratory fraud’, The Guardian, 1 February 2020: https://www.theguardian.com/education/2020/feb/01/david-latchman-geneticist-should-resign-over-his-team-science-fraud – accessed July 2023