How decades of bad science created a mass of urban myths, and what we can trust now – iNews

May 26, 2022

This is Geek Week, my newsletter about whatever nerdy things have happened to catch my eye over the past seven days. Here’s me, musing about something I don’t fully understand in an attempt to get my head around it: I imagine that’s how most editions will be. If you’d like to get this direct to your inbox, every single week, you can sign up here.

Back in 2001, the papers reported on a study into football players. I remember seeing it on the news. “Scientists kick into touch soccer belief in weak foot,” was the Guardian’s headline. “Footballer’s favoured foot all in the mind,” said New Scientist.

The study looked at footage from the 1998 World Cup, and watched 236 players pass the ball. And it noticed that although players often tried to get the ball to their “stronger” foot, when they couldn’t and passed with their “weaker” one, the passes reached their target just as often. “Sports commentators often excuse poor shots by saying they were hit with a player’s weak foot, but it appears there is no such thing,” one of the authors of the study was quoted as saying.

That study has stuck with me for 20 years because it is just so obviously not true. Anyone who’s played or watched any football at all is aware that players have a weak foot and a strong foot. There are players who are better or worse with their weaker foot (James Milner’s two-footedness, like many things about James Milner, is underrated; I genuinely don’t know which foot was Zidane’s “preferred” one, although I suppose I could look it up; Rivaldo used his right foot to stand on, and that was about it). But the idea that a “weaker foot” is a myth seems basically insane to me.

Of course, the fact that I find something deeply counterintuitive doesn’t mean it’s not true. Lots of things in science are deeply counterintuitive, and yet are unarguably the case. “The faster you move, the more slowly you age” is just weird. “Observing a particle changes its behaviour” makes no sense. But those statements are true – the first is at the heart of Einstein’s relativity theory, and the second is central to quantum mechanics – and we can show it.

The trouble is, some areas of science – psychology, most famously – became so interested in finding counterintuitive results that they forgot that they also had to be true.

Will you still need me, will you still feed me

You may know about the “replication crisis” that’s rocked science for the last decade or so. In case you don’t, the basic gist of it is that suddenly, psychologists realised that the bog-standard statistical practices that they’d been following since forever were essentially designed to discover false results.

Say you’re looking into whether or not jelly beans cause acne. (I’m stealing this example from the webcomic XKCD.) So you test whether there’s a correlation between eating jelly beans and developing acne.

You find there isn’t. So you chop the data up into smaller chunks: you look at each individual flavour of jelly bean.

You can normally only publish results that are “statistically significant”: that is, if the effect were not real, you’d expect to see results at least as extreme as yours only one time in 20 or less.

(Very important note: that does not mean “there’s only a one-in-20 chance that the result is a fluke”. I had a go at explaining why here, or you can read the chapter on statistical significance in my last book. Most scientists and indeed a distressing number of stats textbooks get this wrong.)

There are 20 different flavours of jelly bean. You examine each one, and you find a correlation between green jelly beans and acne. So you publish it in Nature, become famous, and grow rich off the TED talks and pop-science book contracts.

But (as you can probably see) all you’ve done there is give yourself 20 chances to get a one-in-20 coincidence. It’s literally the same as rolling a 20-sided die 20 times and then acting surprised when you roll a 20.
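The arithmetic behind that die-rolling comparison can be sketched in a few lines of Python. This is a minimal illustration, assuming the 20 flavour tests are independent and that there is no real effect in any of them; the function names are made up for this example.

```python
import random

ALPHA = 0.05     # the one-in-20 significance threshold
N_FLAVOURS = 20  # one test per jelly bean flavour


def chance_of_any_false_positive(n_tests: int, alpha: float = ALPHA) -> float:
    """Exact probability that at least one of n independent tests
    looks 'significant' when no real effect exists."""
    return 1 - (1 - alpha) ** n_tests


def simulate(n_tests: int, alpha: float = ALPHA,
             trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the same probability by brute force: each test has an
    alpha chance of a fluke, and we count runs with at least one fluke."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    exact = chance_of_any_false_positive(N_FLAVOURS)
    print(f"Exact:     {exact:.3f}")  # about 0.642
    print(f"Simulated: {simulate(N_FLAVOURS):.3f}")
```

Under those assumptions, testing 20 flavours gives roughly a 64 per cent chance of at least one "significant" result from pure coincidence, compared with 5 per cent for a single test.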

This is exactly what psychological science was like for a long time, and too often still is. They also did other things, like coming up with hypotheses after they’d collected the data or not publishing negative results. These apparently innocuous behaviours make it much more likely that the results of any given published paper you read will be false.

Then, in 2011, people noticed, for several reasons. One reason was that Daryl Bem, a psychologist, published a study that (using completely bog-standard statistical techniques) “proved” that psychic powers were real. Another was that some psychologists, again using entirely normal statistical techniques – but this time to make a point, rather than being serious – “proved” that if you listened to When I’m 64 by the Beatles, you literally got younger, as in chronologically younger, as in your birthday got more recent. Which, obviously, can’t be true.

There were other things as well, but they’re the two most interesting. That and the other things drove psychologists to start looking at studies published in the past, and to try to “replicate” them – see if doing the same study again would get the same result. And, long story short, they didn’t.

Blink and you’ll misunderstand it

Brian Nosek, an American psychologist, led a project to replicate 100 earlier studies and found that only about 50 of them replicated. The ones that did replicate showed much smaller effects.

And the victims were all the things you’ve read about in Malcolm Gladwell books or whatever for decades. A hugely famous 1988 study found that if you hold a pencil between your teeth (thus forcing yourself to smile), you’ll become happier. Didn’t replicate. Everyone got very excited about “power posing”: didn’t replicate (despite the associated TED Talk’s 64 million views). The field of “social priming”, where brief subconscious cues changed behaviour in dramatic ways (words associated with money make us greedier, words associated with age make us slower and weaker, the whole concept of “subliminal advertising”) basically fell apart.

You may think this doesn’t matter, but it does. Quite apart from books like Blink being full of what turns out to be nonsense, and widely believed concepts like the Dunning-Kruger effect, the Stanford Prison Experiment, or the marshmallow test being either not real or much less important than claimed, these ideas have made it into serious policymaking.

“Growth mindset”, the idea that saying things like “you worked hard” rather than “you are smart” to children makes them more willing to learn, has become so widely accepted that my primary-school-aged kids have to make posters of it in class, but it’s far from clear that it’s real. I’ve written a piece for i’s comment pages this week about the “implicit association test” and unconscious bias training, and how staff in government and businesses are often required to undergo them, but they’re built on sand.

It’s now become a sort of hobby of mine, waiting for psychology studies that I’ve cited or ideas I’ve believed to go through the replication mill and come out broken.

The most recent – and the reason I’ve written this piece, so perhaps I ought to have mentioned it earlier – is this one. Maybe you remember all those studies finding that men are more likely to tip waitresses who are ovulating, and so on? Well, turns out they’re probably nonsense too.

Knowing who to trust

I brought up that football study at the beginning because (as far as I know) no one’s tried to replicate it, but I confidently predict that if someone did, they’d find it wasn’t true. Or, perhaps more likely, they’d find some reason why the headline claim (“Football players don’t have a weaker foot”) isn’t backed up by reality, and actually it’s caused by players only attempting easier passes with their left, or something.

But there’s a real problem here. How do we know what studies to trust? It’s genuinely difficult. We can say “We’ll only believe the studies that tell us things we already think are true,” but that renders science largely pointless. We can say “We’ll trust everything until the replication attempt fails,” but then you’ll end up spending years believing nonsense.

I think the only realistic path is to use your judgement as carefully as you can, and to take into account how likely you think a hypothesis is beforehand, how good the study seems to you, and whether you trust the authors and journal involved. And of course if something’s replicated several times, or if it’s in line with existing findings, then you should be more confident. But “have really good judgement based on years of experience looking at other scientific experiments” isn’t a very useful piece of advice – the equivalent of a football coach telling his players to be better than they are – and understandably, people don’t like the idea that science is a subjective judgement-based process, even though it is and has to be.
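One way to see why the prior plausibility of a hypothesis matters is a rough back-of-envelope calculation. The sketch below is illustrative only: the 80 per cent "power" figure (the chance a study detects an effect that really exists) and the example priors are assumptions chosen for the illustration, not numbers from any study mentioned here, and the function name is hypothetical.

```python
def prob_finding_is_true(prior: float, power: float = 0.8,
                         alpha: float = 0.05) -> float:
    """Chance that a 'statistically significant' result reflects a real
    effect, given how plausible the hypothesis was beforehand.

    prior: assumed probability the hypothesis is true before the study
    power: assumed chance the study detects a real effect
    alpha: chance of a significant result when there is no effect
    """
    true_positives = prior * power
    false_positives = (1 - prior) * alpha
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical priors: a coin-flip idea, a long shot, a wild claim.
    for prior in (0.5, 0.1, 0.01):
        print(f"prior {prior:>4}: {prob_finding_is_true(prior):.2f}")
    # prior  0.5: 0.94
    # prior  0.1: 0.64
    # prior 0.01: 0.14
```

On these assumed numbers, a significant result for a hypothesis that started out as a coin flip is very probably real, while the same result for a wildly implausible claim is still most likely a fluke. That is the formal version of being more sceptical of surprising findings.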

The alternative is to outsource your judgement to other people. As a non-scientist that’s often what I do – I phone up scientists I trust and ask them whether something is likely to be true. And you might want to try this quiz – which of these scientific papers replicated? If you find you do pretty well, you might be more trusting of your own judgement. (I’ll just boast that I do pretty well at it, which to be fair you’d bloody hope, given my job.)

Self-promotion corner

I’ve written about monkeypox this week. The long and the short of it is: no, it’s not going to be another Covid; it appears to be driven by some superspreader events at gay clubs and festivals, and the risk to the general public is low; that said, the risk to certain groups of gay men is reasonably high, and it’s important to get the information to those groups without stigmatising them or suggesting that it’s a “gay disease” or anything.

Nerdy blogpost of the week: We Walk Among You

Freddie deBoer is a wonderful blogger, especially on the topic of education. He also has a bit of a chequered past: a few years ago, in the grip of a psychotic episode, he publicly accused someone of some awful crimes.

He’s accepted responsibility for that, and has never tried to blame his illness. But he has written some really interesting things about mental health, and about how our modern idea of “awareness” of mental illness always focuses on the sympathetic, unobtrusive forms – anxiety and depression, the importance of “self-care” and so on.

But as a friend of mine with a psychosis-related disorder once pointed out, we are rarely comfortable talking about the destructive side of mental illness, how people with mental health issues can find themselves doing dreadful things and hurting those around them.

DeBoer’s post “We Walk Among You” talks about that. It’s not an easy read, but it’s important:

This is the moral poverty of this view of mental illness; it is “accepting” of it only through the mechanism of deracinating it from genuine, ugly, human pathology. This is the emotional violence of romanticism and these are the wages of a Manichean moral culture. Like a shitty 1990s movie where a character who’s somehow different (deaf, perhaps, or suffering from cognitive disability) is thus also made unthreatening, sexless, harmless, our culture can now only conceive of the mentally ill as entirely blameless, because as a people we do not now have the humility and the wisdom to treat those who are different with the sober and thoughtful inquiries of adult judgment. The mentally ill can’t be understood as agency-laden adults who suffer under constraints that are hard to understand. They must be exonerated from all blame like children. Which means that, in your eyes, those who you are dedicated to blaming cannot be mentally ill. To blame and exonerate someone at the same time is complicated, unsatisfying. And nothing now is allowed to be complicated.
