by Andrew Leigh
One visa lottery study looked at workers at an Indian software company who moved to work in software companies in the United States.28 Compared with unsuccessful applicants, those who won the visa lottery increased their earnings sixfold. Another analysis found that Tongans who won the visa lottery to move to New Zealand virtually quadrupled their earnings.29 But victory had a price. The study monitored the extended families of successful and unsuccessful lottery applicants. The researchers found that winning the visa lottery was bad luck for those family members who stayed behind in Tonga. Migrants tended to be the breadwinners, so when they moved to New Zealand, incomes in their former Tongan households declined.
Importantly, the researchers were also able to compare the impact from the randomised experiment with what a naive analysis would have estimated. Comparing applicants with non-applicants – a common research strategy – would have produced the mistaken conclusion that migration reduced poverty among those Tongan families who sent a migrant to New Zealand. In other words, a study based on non-randomised data would have got the result exactly backwards.
Lotteries have also taught us something about religious observance. Every year, over a million Muslims perform the Hajj pilgrimage in Saudi Arabia. Participants often claim that travelling to Mecca fosters unity, but critics have feared that it could increase hatred towards non-Muslims. So a team of economists surveyed Pakistanis who had applied to that country’s Hajj lottery.30 Compared with those who missed out, lottery winners were more committed to peace, more accepting of other religions and had more favourable attitudes towards women. Pakistani pilgrims became more devout, but they were also more tolerant – most likely because the Hajj exposed them to people from around the globe.
Another simple form of randomised trial comes from randomised audits. In the United States, randomised taxpayer audits have been conducted since 1963, as a means of better targeting the Internal Revenue Service’s compliance strategies.31 The philosophy behind them is that tax avoidance has a habit of avoiding the eyes of the authorities. Rather than relying solely on informants and experts, a random audit looks at the level of compliance across the community, by randomly selecting something like 1 in 2000 returns for an in-depth check. The randomised audit approach embodies a sense of modesty, recognising that tax cheats can be hard to find. It’s a bit like asking your friends to help look for your lost car keys: a fresh perspective can bring useful insights.
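To make the idea concrete, here is a minimal sketch of how a 1-in-2000 random audit might be drawn. It is an illustration only, not a description of how any tax authority actually selects returns: the function name `select_for_audit`, the numeric return IDs and the fixed seed are all assumptions made for the example.

```python
import random

def select_for_audit(return_ids, rate=1 / 2000, seed=None):
    """Flag each return for audit independently with probability `rate`.

    Every return has the same chance of selection, so the audited sample
    reflects compliance across the whole community rather than only the
    cases that informants or experts would have flagged.
    """
    rng = random.Random(seed)
    return [rid for rid in return_ids if rng.random() < rate]

# Hypothetical return IDs: one million returns numbered from zero.
returns = range(1_000_000)
audited = select_for_audit(returns, seed=42)

# With a 1-in-2000 rate we expect about 500 audits; the exact count
# depends on the random draw.
print(f"{len(audited)} returns selected for an in-depth check")
```

Because selection ignores everything about the taxpayer, the share of audited returns that turn out to be non-compliant is an unbiased estimate of non-compliance overall.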
Randomised audits are politically controversial, which is why they have been discontinued in Australia, Sweden and (for a time) the United States.32 But they aren’t just about improving the system so we find more wrongdoers. They’re also about reducing the number of times that the tax authorities chase someone up only to find they’ve done nothing wrong. One study estimates that better targeting compliance efforts will avoid tens of thousands of people being contacted by the tax authorities.33
For researchers, randomised audits are the best way of answering the question: who is most likely to underreport their income to the authorities? A recent study finds that taxpayers in the top 1 per cent failed to report 17 per cent of their true income, while low- and middle-income taxpayers omitted 4 per cent of their true income.34 Only through a randomised audit study was it possible to uncover the fact that income underreporting was roughly four times worse among the richest taxpayers.
Another straightforward use of audits occurs in Brazil, where the federal government randomly audits a sample of municipal governments to check their use of federal funds.35 The audits revealed that nearly one-third of all federal funds were lost to corruption. Over time, municipal government audits have increased the numbers of mayors convicted for misusing their office, and reduced the incidence of corruption. Municipal audits are so popular among the Brazilian public that they are conducted alongside the national lottery.
*
Simple randomised trials can teach us a great deal about the world – but we have to be careful about experiments that are done in settings that are nothing like the real world. As Chicago economist Frank Knight once put it, ‘The existence of a problem in knowledge depends on the future being different from the past, while the possibility of a solution of the problem depends on the future being like the past.’36
In this book, I’ve focused mostly on randomised experiments that test real-world practices – ranging from knee surgery to Drug Courts. By contrast, I haven’t spent much time discussing the experiments that take place in scientific laboratories, where randomisation is intrinsic to the research. The people who play with test tubes may not think of themselves as randomistas – but in many cases, that’s exactly what they’re doing.
But a more controversial group of experimenters are those doing ‘lab experiments’ in social science. This often involves recruiting university students to answer hypothetical questions and play computer games. Rather less exciting than the games you might try on your Xbox or PlayStation, laboratory games are designed by social scientists to test behaviour in hypothetical settings.
Discussing the spectrum of experiments, economists Glenn Harrison and John List lay out four categories of randomised experiments.37 First, natural field experiments, of the kind we’ve discussed in this book, where subjects are doing tasks they would normally do, and often don’t know they are in an experiment. Natural field experiments could involve anything from providing support to disadvantaged students to changing the wording of a marketing letter.
Second, framed field experiments, where people know they are in an experiment, but the setting or commodity is natural. In one experiment, researchers set up a table at a sports card convention, and tested how collectors behaved in response to auctions for desirable sports cards.38 Another framed field experiment, run in Sweden in the early 1970s, tested how much people were willing to pay in order to be the first to watch a brand-new television program.39
Third, artefactual field experiments. These are generally conducted in a university computer lab, but not with university students. Artefactual field experiments might involve games to test attitudes to risk among share market traders, or trustworthiness in the general population.
Fourth, conventional laboratory experiments. The typical laboratory experiment is run on university students, with an imposed set of rules and abstract framing. Conventional lab experiments include games that are used to test economic theories about fairness, altruism and attitudes to inequality.
From the researcher’s standpoint, conventional laboratory experiments are simple to implement. Students are readily recruited – psychology students are often required to take part in experiments as a condition of their enrolment – and are generally paid a fairly low rate per hour. One of the downsides of this kind of simple study was summed up by Harvard psychologist Steven Pinker, who once quipped: ‘When psychologists say “most people”, they usually mean “most of the two dozen sophomores who filled out a questionnaire for beer money”.’40
Those students who volunteer for experiments are likely to be different from the student population at large. According to one comparison, students who sign up for lab experiments tend to spend more, work less, volunteer more, and to have an academic interest in the subject of the experiment.41 Additionally, students differ from people who are midway through their careers. For example, a study in Costa Rica had students and chief executives play the same set of trust games.42 The business chiefs were significantly more trustworthy than the students.
A major concern with the laboratory setting is that its results will not generalise to the real world. When a professor is setting the rules of the game, people may react differently than in real life. For example, one lab experiment found that student subjects who had never given a cent to charity gave 75 per cent of their endowment to the charity in the lab experiment.43 It’s risky to draw broad conclusions from experiments that see real-life Scrooges becoming Oskar Schindlers the moment they enter the laboratory.
Admittedly, it’s not always true that what happens in the lab stays in the lab. One psychology experiment tested whether it was possible to artificially generate interpersonal closeness.44 University students were randomly paired up. Half the pairs were asked to make small talk, while the other half were given questions designed to build intimacy, such as ‘What does friendship mean to you?’, ‘For what do you feel most grateful?’ and ‘What one item would you save if your house was burning?’ The intimacy exercise worked so well that one pair of participants got married.
As with science experiments, we can get the best from social science laboratory experiments if we recognise their limitations. Recall that nine out of ten pharmaceuticals that work in the science lab fail to get approved for use on the general public. Similarly, we should generally regard social science laboratory experiments – whether run by psychologists or economists – as promising but not definitive.
*
Not all simple randomised experiments are good, but plenty of good randomised experiments can be simple. Economist Uri Gneezy tells the story of running an experiment to help a California vineyard owner decide how much to charge for his wines.45 Gneezy and the business owner selected a cabernet that usually sold for $10. They then printed three versions of the cellar door price list, with the cabernet variously priced at $10, $20 or $40. Each day over the next few weeks, the winery randomly chose one of the three price lists. It turned out that the cabernet sold nearly 50 per cent more when priced at $20 than $10. The experiment took a few minutes to design and a few weeks to implement, but it boosted the winery’s profits by 11 per cent. A full-bodied result with aromas of easy money – consume now.
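The design of the winery experiment is simple enough to sketch in a few lines. What follows is an illustration under stated assumptions, not Gneezy’s actual procedure: the three prices come from the story, but the 21-day run, the function name `draw_price_schedule` and the fixed seed are made up for the example.

```python
import random

# The three cabernet prices printed on the alternative price lists.
PRICES = [10, 20, 40]

def draw_price_schedule(num_days, seed=None):
    """Independently pick one of the three price lists for each day."""
    rng = random.Random(seed)
    return [rng.choice(PRICES) for _ in range(num_days)]

# Assumed length of the experiment: three weeks of cellar door trading.
schedule = draw_price_schedule(num_days=21, seed=7)
for day, price in enumerate(schedule, start=1):
    print(f"Day {day}: cabernet listed at ${price}")
```

Because chance decides which list is used each day, busy weekends and quiet Mondays should be spread roughly evenly across the three prices, so a difference in sales can be put down to the price rather than to the calendar.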
In this chapter, I’ve given a taste of how you can run low-fuss randomised experiments in your personal life and within your own organisation. I hope it’s encouraged you to think about becoming a randomista. If you do, let me know how it goes.
At a larger scale, we need to dispel the myth that all randomised trials need to be expensive and long-lasting. As Nudge Units have shown, tweaks can produce big gains. Such trials remind me of one of the maxims of writer Tim Ferriss: ‘If this were easy, what would it look like?’ Too often, we overcomplicate things by envisaging perfect schemes that are too hard to implement in practice. Like the Occam’s razor principle in science, simple randomised trials can often teach us a great deal.
Unfortunately, some critics of randomised trials take the same simple approach, shooting down randomised evaluation with a single riposte: ‘It’s unethical.’ So let’s now take a serious look at the ethical side of randomised trials and consider how we can ensure that randomised trials do as much good – and as little harm – as humanly possible.
11
BUILDING A BETTER FEEDBACK LOOP
As a young man, Luke Rhinehart decided he would start making decisions based on a dice roll. Periodically, he would write out a numbered list of possible things to do, roll a die and then follow its lead. Driving past a hospital one day, he saw two beautiful nurses walking along the side of the road. Feeling shy, Rhinehart drove on, but then decided that if he rolled an odd number, he would go back. The dice came up with 3. Rhinehart turned the car around, stopped next to the women and introduced himself. He gave them a lift, they arranged to play tennis the next day, and he ended up marrying one of the women, Ann.1
Rhinehart couldn’t let go of the idea of making decisions based on rolling dice, so he began writing novels in which characters took randomisation too far. His most famous book, The Dice Man, was named the ‘novel of the century’ in 1999 by Loaded magazine. The main character does everything his dice tell him to. He ends up committing murder, arranging a breakout by psychiatric patients and organising a debauched ‘dice party’.
Rhinehart was not the first writer to tackle randomisation in fiction. In 1941 Jorge Luis Borges wrote a short story titled ‘The Lottery in Babylon’, in which every major decision in the mythical society of Babylon is determined by a lottery. The lottery decides whether someone becomes an official or a slave, and whether an accused person is declared innocent or sentenced to death. Babylon, wrote Borges, ‘is nothing but an infinite game of chance’.2
Borges’ world may seem farfetched, but luck plays a significant role in real-life politics too.3 In some cases, luck is even built into the system, as with the ancient Athenians, who chose each day’s ruler by lottery, using a stone machine called the kleroterion (their equivalent of a modern-day Powerball machine). Because each person’s rule only lasted a day, it is estimated that a quarter of Athenian citizens ruled the city-state at some point. Random election systems – known as ‘the brevia’ and ‘the scrutiny’ – were also used in late medieval and Renaissance Italy. They survive today in the modern jury, in which criminal defendants are judged by a randomly selected group of their peers.
If you’ve ever tossed a coin when faced with a major choice, you’ll know that leaving a decision to chance can be liberating. A few years ago, economist Steven Levitt set up a website to explore this concept.4 People standing at a fork in the road were invited to have luck determine which path they should take. Agonising over a life choice? You simply told Levitt’s website your two options. It then tossed a coin and told you what to do. Six months later, these people were surveyed about their happiness. Over 20,000 people took the challenge, and nearly two-thirds did what the coin said to do.
To test the impact, Levitt used a standard life satisfaction survey, which asks people to rate their happiness on a scale from 1 (miserable) to 10 (ecstatic). Try it now. If you said 7 or 8, then you’re in the middle of the pack: about half the population in advanced nations give one of those answers. Another quarter say they are 6 or sadder, while the remaining quarter report their happiness as a 9 or 10. So even though it’s a ten-point scale, most of us are separated by just a few points. Because Levitt’s study was a randomised trial, he could be sure that any difference in happiness between the heads group and the tails group was due solely to the coin toss.
For unimportant decisions – such as growing a beard or signing up for a fun run – the choice didn’t matter much. But making a more significant life change – such as moving house or starting a business – led to a two-point increase on the happiness scale. A common question was whether to end a romantic relationship. Among these people, breaking up led to nearly a three-point increase in happiness. Quitting a job led to a massive five-point increase: the equivalent of shifting from glum to gleeful. ‘Winners never quit, and quitters never win’ is bad life advice. In the happiness stakes, quitters win.
Levitt’s website refused to help people make decisions containing words like ‘murder’, ‘steal’ and ‘suicide’.5 But it did shape some major life decisions. The toss of the coin led to about a hundred additional romantic relationships breaking up, but also to about a hundred couples staying together who might otherwise have split.6 It would have been immoral to force people to quit their jobs or divorce their spouses, but tossing a coin for someone who is on the fence is something that many people would do for a friend.
If you ask critics why they don’t like randomised trials, one of the most common responses is that control groups are unethical. Already, we’ve touched on some real-life examples of this. Ethics panels have approved sham surgery on the basis that the clinical effectiveness of many surgical procedures is uncertain. Randomised trials now suggest that popular procedures like knee surgery for torn cartilage may be ineffective. Likewise, randomised trials to reduce crime have seen the treatment group do better in the Restorative Justice and Drug Court experiments, while the control group did better in the Scared Straight and Neighbourhood Watch experiments. In development economics, the success of village schools in Afghanistan surprised many of the country’s education experts. If we knew for sure that it was better to be in the treatment group, then we wouldn’t be doing the trial in the first place.
Most medical trials operate on the principle of informed consent: patients must agree to participate in the research. But not every study works this way. From 2005 to 2011, researchers in Sydney conducted a trial in which not a single patient consented – because all of them were either unconscious or suffering from severe head trauma. One in three would be dead within a month.
The Head Injury Retrieval Trial was the brainchild of neurosurgeon Alan Garner. He wanted to test whether patients were more likely to recover from a major head injury if they were treated by a trauma physician rather than a paramedic. Because the physician has to be transported to the scene by helicopter, the study aimed to test whether society could justify the extra expense of sending out a physician. If the outcomes from those treated by physicians weren’t better, the extra money would be better spent in other parts of the health-care system.
Garner’s study worked with the emergency call system in Sydney. When operators received a call reporting a serious head injury, a computer performed the electronic equivalent of tossing a coin. Heads, the patient got an ambulance and a paramedic; tails, they got a helicopter and a trauma physician.
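The ‘electronic equivalent of tossing a coin’ can be sketched as follows. This is a hypothetical illustration, not the software used in the Head Injury Retrieval Trial: the function name `assign_response` and the arm labels are assumptions, and a real trial would also log each allocation and guard it against tampering.

```python
import random

AMBULANCE = "ambulance and paramedic"         # heads
HELICOPTER = "helicopter and trauma physician"  # tails

def assign_response(rng):
    """Toss a fair electronic coin to decide which team is dispatched."""
    return AMBULANCE if rng.random() < 0.5 else HELICOPTER

# A seeded generator makes this demonstration repeatable; a live system
# would not use a fixed seed.
rng = random.Random(2008)
allocations = [assign_response(rng) for _ in range(1000)]
print(allocations.count(AMBULANCE), "calls sent an ambulance")
print(allocations.count(HELICOPTER), "calls sent a helicopter")
```

Over many calls the two groups end up about the same size and, more importantly, similar in every respect except the kind of help they received.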
In 2008, halfway through the trial, I interviewed Garner, who told me that although he had spent much of his career thinking about the issue, he didn’t know what to expect from the results.7 ‘We think this will work,’ Garner told me, ‘but so far, we’ve only got data from cohort studies.’ He admitted that, ‘like any medical intervention, there is even a possibility that sending a doctor will make things worse. I don’t think that’s the case, but [until the trial ends] I don’t have good evidence either way.’