Randomistas


by Andrew Leigh


  Now, you might think that a politician telling you about political experiments is a bit like a pet rat saying that its cage could really use some mazes. Some of my parliamentary colleagues might raise an eyebrow at the prospect of encouraging researchers to experiment upon us. But the fact is that randomised studies have cast new light on how the political process operates. If politicians are providing worse services to minority race constituents and better services to donors, then the public has a right to know about it.

  *

  In 2008, as they went to the polls to overwhelmingly elect America’s first black president, California voters also supported a ballot measure striking down same-sex marriage. Gay and lesbian activists were shocked and upset. From defeat at the ballot box, community leaders began exploring the best way of reducing prejudice.

  Many ideas were floated, but the activists ultimately chose a strategy centred around honest, vulnerable conversations. Campaigners would go door-to-door, sharing their stories of discrimination, and inviting householders to do the same. The approach was known as ‘deep canvassing’, and aimed to break down discrimination by creating a greater sense of empathy towards gay and lesbian people. David Fleischer, who now directs the leadership program at the Los Angeles Lesbian, Gay, Bisexual and Transgender Center, says that it worked because it connected to people’s values: ‘When we were nonjudgmental and vulnerable with them and when we exchanged our lived experiences about marriage and gay people, that’s when we started changing people’s minds.’60

  At the end of 2014, deep canvassing received its strongest affirmation. An article in Science, one of the world’s top academic journals, reported that a randomised experiment in California had found that a twenty-minute conversation with a gay canvasser could shift attitudes towards same-sex marriage.61 The results were splashed across the media.

  In Florida, two young scholars, David Broockman and Joshua Kalla, were working on a similar study – this time exploring whether deep canvassing could change attitudes to transgender people. But when they looked closely at the data from the Science study, they found a range of irregularities.62 Eventually Broockman and Kalla concluded that one of the authors of the original study – a graduate student – had fabricated the results. His senior collaborator, Donald Green, asked Science to withdraw the paper, sorrowfully concluding: ‘There was no data, and no plausible way of getting the data.’63

  To Fleischer, who had campaigned for over thirty years on lesbian and gay rights, the news came as a shock. He personally phoned journalists who reported on the study to say that there was no longer evidence for deep canvassing. The graduate student on the study ‘had lied to us. Taken advantage of us. And I also wanted to point out to people we were not going to give up.’64

  With the California study now discredited, Broockman and Kalla found themselves working on the frontier of knowledge. Their Florida study was now the first randomised evaluation of deep canvassing. Three months after canvassers had gone door-to-door, households were surveyed by telephone. When Broockman analysed the data, he backed away from his screen, and said, ‘Wow, something’s really unique here.’65

  The Florida study not only confirmed the value of deep canvassing; it also showed effects more powerful than the faked California results.66 One way of benchmarking the size of the effect is to compare it with the steady evolution in attitudes to transgender people that has taken place over time. Asked to rate their view of transgender people on a scale from 0 to 100, the average American’s attitude has warmed by 9 points over a fifteen-year period. But the campaigners had a bigger impact still: a single conversation with a canvasser caused Floridians’ attitudes to transgender people to warm by 10 points. Just a ten-minute personal conversation took householders more than fifteen years forward in their attitudes towards transgender people.

  The new research also suggested that social change had more to do with the style of the conversation than the background of the canvasser. While the bogus California research had purported to find effects only from gay campaigners, the Florida study showed that both transgender and non-transgender canvassers were able to shift attitudes. As Fleischer put it, ‘Our ability to change voters’ hearts and minds has been measured, this time for real.’67

  Not every story of academic misconduct has a happy ending, but one reason the fabrication came to light is that randomised experiments are, in their essence, extremely simple. That simplicity makes it easy to compare results across studies, and shows up questionable findings. Had the Californian analysis involved a bunch of clever statistical tricks, the fraud might never have been uncovered. Randomised trials are so simple that anyone can run one. In fact, let’s look at how you might do just that.

  10

  TREAT YOURSELF

  For a few days in July 2017, people searching Google for ‘randomised trial’, ‘A/B testing’ or ‘RCT’ might have seen an ad pop up in the sidebar. At the bottom of the ad were the words ‘A new book due in 2018’ and a link to my publisher’s website.

  But it was the first part of the ad that mattered. Web surfers were randomly shown one of twelve possible book titles, including Randomistas: Experiments That Shaped Our World, Randomistas: The Secret Power of A/B Tests and The Randomistas: How a Simple Test Shapes Our World. My editors and I each had our favourite titles, but we had agreed to leave the final decision to a randomised experiment. The medium that brought you cat videos, the ice bucket challenge and Kim Kardashian would choose this book’s title.

  A week later, over 4000 people had seen one of the advertisements, and we had a clear winner. People who saw Randomistas: How Radical Researchers Changed Our World were more than twice as likely to click the ad as those who saw Randomistas: The Secret Power of Experiments. The worst-performing title (not a single person clicked on it) was Randomistas: How a Powerful Tool Changed Our World. The experiment took about an hour to set up, and cost me $55.
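
  For readers who want to check the arithmetic behind a test like this, the comparison is a simple two-proportion test, easy to sketch in a few lines of Python. The click and impression counts below are invented for illustration – Google’s advertising tools report the real figures automatically – so treat this as a sketch of the method, not my actual data.

```python
# Sketch: comparing click-through rates of two ad variants.
# All counts here are hypothetical, not the book-title experiment's data.
from math import erf, sqrt

def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference in click-through rates."""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approximation
    return p_a, p_b, z, p_value

# Hypothetical: variant A shown 350 times with 14 clicks,
# variant B shown 340 times with 6 clicks.
p_a, p_b, z, p = two_proportion_z(14, 350, 6, 340)
print(f"CTR A: {p_a:.1%}, CTR B: {p_b:.1%}, z = {z:.2f}, p = {p:.3f}")
```

With a few thousand impressions split a dozen ways, small gaps in click rates can easily be noise, which is why it pays to run the comparison formally rather than eyeball it.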

  A few years earlier, I had written a book on inequality for the same publisher. My editor wanted to call it Fair Enough? My mother suggested Battlers and Billionaires. After running Google ads for a few days, we found that the click rate for my mother’s title was nearly three times higher. My editor graciously conceded that the evidence was in, and Battlers and Billionaires hit the shelves the following year.

  Were these experiments perfect? No way. Since I was trying to sell books, the ideal experiment would have randomised book covers – perhaps on Amazon or in a bookstore. But that would have taken more time and money than I had available. It seemed reasonable to assume that people searching for a topic were sufficiently similar to those who would buy books on that same subject.

  Anyone looking to run a better email campaign or redesign a website has dozens of online tools at their fingertips, including AB Tasty, Apptimize, ChangeAgain, Clickthroo, Kameleoon, Optimizely, SiteSpect and Webtrends. Retailers using the Amazon platform can even use Splitly, which randomly changes product descriptions and images. As we saw in Chapter 8, Amazon promised in 2000 never to run pricing experiments. But today, Splitly lets third-party retailers use the Amazon platform to randomly vary prices. The site claims to have generated nearly US$1 million in new sales through A/B testing on Amazon. Algorithms like these are one reason that Amazon’s prices fluctuate wildly. To see this, check out the website CamelCamelCamel.com, which shows past prices for Amazon products sold by third-party retailers. In the years 2014 to 2017, the best price of the game Classic Twister ranged from $3.48 to $49.80.1

  When I taught introductory economics at the Australian National University, I ran a small randomised experiment on my students, testing whether dressing more formally had any impact on their ratings of the lectures. Through the semester, I randomly varied the days that I chose to don a jacket and tie, versus wearing something a little less formal. At the end of each lecture, I asked all students to rate it from 1 to 5. Crunching the data after my final presentation, I found no evidence that my students preferred a talk delivered in a tie. The lesson for lecturers: sweat the facts, not the fashion.
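
  If you’re tempted to run a classroom experiment of your own, the analysis can be as simple as a permutation test: pool the ratings, shuffle the ‘jacket’ labels at random many times, and ask how often chance alone produces a gap as large as the one you observed. Here is a minimal sketch; the ratings are invented, and the permutation test is just one reasonable way to crunch such data.

```python
# Sketch: permutation test for a randomised lecture-dress experiment.
# Each number is an invented average student rating (1-5) for one lecture.
import random
from statistics import mean

formal = [3.9, 4.1, 3.8, 4.0, 3.7, 4.2]   # jacket-and-tie days (hypothetical)
casual = [4.0, 3.8, 4.1, 3.9, 4.2, 3.8]   # less formal days (hypothetical)

observed = mean(formal) - mean(casual)

# If dress made no difference, relabelling the lectures at random should
# often produce a gap at least as large as the observed one.
pooled = formal + casual
extreme, trials = 0, 10_000
for _ in range(trials):
    random.shuffle(pooled)
    gap = mean(pooled[:len(formal)]) - mean(pooled[len(formal):])
    if abs(gap) >= abs(observed):
        extreme += 1

print(f"observed gap: {observed:+.2f}, permutation p-value: {extreme / trials:.2f}")
```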

  Self-experimentation has a long tradition in medicine.2 To prove that surgery didn’t require a general anaesthetic, American surgeon Evan Kane injected himself with a local anaesthetic and then removed his own appendix. To prove that the polio vaccine was safe, Jonas Salk injected himself, and then his wife and children. To prove that the myxoma virus was fatal to rabbits but harmless to humans, Australian scientist Frank Fenner injected himself with enough of the virus to kill hundreds of rabbits. He was unharmed, though some Australians afterwards liked to call him ‘Bunny’.

  Some have pushed the limits even further. Addressing the 1983 Urological Association conference in Las Vegas, Giles Brindley announced to his audience that it was possible to produce an erection through direct injection. He then informed them that shortly before the lecture he had injected his penis with an erectile drug known as papaverine. On the screen, he showed slides of his penis in its flaccid state. Brindley assured the audience that no normal person would find giving a lecture to be an erotic experience. As one observer recalls, ‘He then summarily dropped his trousers and shorts, revealing a long, thin, clearly erect penis. There was not a sound in the room. Everyone had stopped breathing.’3

  Single-patient experiments can be randomised, if the treatment is turned on and off. These ‘single subject’ or ‘N of 1’ experiments are becoming increasingly common in the case of drugs that are developed for rare diseases, or tailored based on the genetics of the patient.4 One ongoing experiment of this kind is a Dutch trial of treatments for rare neuromuscular diseases, which can affect just 1 in 100,000 people.5 Almost invariably, the most expensive drugs are those used to treat rare diseases.6 Single-patient trials are likely to be vital in helping health authorities decide whether drugs that cost hundreds of thousands of dollars per year are having the desired effect.
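
  To make the design concrete, here is a minimal sketch of how an ‘N of 1’ schedule might be randomised: within each cycle, a coin flip decides whether the active drug or the placebo comes first. The block structure and the number of cycles are illustrative assumptions, not a clinical protocol.

```python
# Sketch: randomising an 'N of 1' (single-patient) trial schedule.
# The cycle count and paired-period structure are illustrative assumptions.
import random

def n_of_1_schedule(cycles=4, seed=None):
    """Return a drug/placebo sequence, randomised within each cycle."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(cycles):
        pair = ["drug", "placebo"]
        rng.shuffle(pair)          # coin flip: which comes first this cycle
        schedule.extend(pair)
    return schedule

print(n_of_1_schedule(cycles=4))
# e.g. ['placebo', 'drug', 'drug', 'placebo', 'drug', 'placebo', 'placebo', 'drug']
```

Pairing the periods this way means each cycle contains one treated and one untreated stretch, so the patient serves as their own control even if their condition drifts over time.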

  Like the pricing experiment in Chapter 8 – which involved just one store changing its prices from week to week – these N-of-1 experiments are a new way to use randomisation to learn about the world around us.

  *

  Because some of the most famous randomised trials involved a large number of people (like the conditional cash transfer experiment in Mexico), cost a lot of money (like the RAND Health Insurance experiments) or took many years (like the Perry Preschool experiment), randomising can sometimes seem too hard. That’s why some of today’s researchers are making it a priority to show that randomised experiments can be done quickly, simply and cheaply.

  In 2013 the Obama White House, working with a number of major foundations, announced a competition for low-cost randomised trials. The aim was to show that it was possible to evaluate social programs without spending millions of dollars. From over fifty entries, the three winners included a federal government department planning to carry out unexpected workplace health and safety inspections, and a Boston non-profit providing intensive counselling to low-income youth hoping to be the first in their family to graduate from college.7 Each evaluation cost less than $200,000. The competition continues to operate through a non-profit foundation, which has announced that it will fund all proposals that receive a high rating from its review panel.8

  Simplicity is at the core of the approach taken by the behavioural insights teams which are emerging in central government agencies across the globe. In 2010 the British government became the first to establish a so-called ‘Nudge Unit’, to bring the principles of psychology and behavioural economics into policymaking. The interventions were mostly low-cost – such as tweaking existing mailings – and were tested through randomised trials wherever possible. In some cases they took only a few weeks. Since its creation, the tiny Nudge Unit has carried out more randomised experiments than the British government had previously conducted in its entire history.9

  The Nudge Unit focused on ‘low cost, rapid’ experiments.10 It found that letters asking people to pay their car tax were 9 percentage points more effective if they included a photograph of the offending vehicle, along with the caption ‘Pay Your Tax or Lose Your Car’.11 A personally scribbled note on the envelope along the lines of ‘Andrew, you really need to open this’ increased taxpaying by 4 percentage points. In an era of mass mailings, handwriting notes on envelopes is laborious, but every £1 spent on it garnered £200 in additional fines.12 For late taxpayers, the Nudge Unit experimented with various appeals, ultimately finding the most effective message to be: ‘The great majority of people in your local area pay their tax on time. Most people with a debt like yours have paid it by now.’ Adding these two sentences increased the repayment rate by 5 percentage points.13 This impact represents millions of pounds of additional revenue for an experiment that cost basically nothing.

  Other interventions are similarly cost-effective. Britons owing money to the courts were twice as likely to pay up if they were sent a text message ten days before the bailiffs were scheduled to knock on their doors.14 The texts averted 150,000 bailiff visits.15 Overseas visitors were 20 per cent more likely to leave the country on time if they received a letter before their visa expired.16 Jobseekers were nearly three times as likely to attend a recruitment event if the reminder text message was personalised and wished them good luck.17 Online, the Nudge Unit tested how best to encourage people renewing their driving licences to sign up for the organ donor registry.18 They randomly trialled eight different messages. One was a picture of smiling people and the words: ‘Every day thousands of people who see this page decide to register.’ Another had no photo, just the text: ‘If you needed an organ transplant, would you have one? If so, please help others.’ As Nudge Unit head David Halpern points out, it isn’t immediately obvious which of these would be more effective. It took a randomised trial to prove that the ‘would you have one?’ message produced 100,000 more organ donors a year.

  Following the British model, Nudge Units have been established by governments in Australia, Germany, Israel, the Netherlands, Singapore and the United States, and are being actively considered in Canada, Finland, France, Italy, Portugal and the United Arab Emirates.19 An Australian study run by the Nudge Unit in New South Wales found that simply stamping ‘Pay Now’ in red at the top of a letter raised payment rates by 3 percentage points, adding $1 million to government revenues and allowing over 8000 drivers to avoid having their licences cancelled.20 Another study, with St Vincent’s Hospital in Sydney, randomly tested eight variants of text message reminders. Compared with the standard message, attendance improved by 3 percentage points among patients who were reminded that turning up to their appointment would save the hospital $125.21

  BETA, the Australian government’s Nudge Unit, has collaborated with over a dozen federal departments and agencies, including the Department of Foreign Affairs, the Australian Taxation Office and the National Disability Insurance Agency. Like his British counterpart David Halpern, BETA’s founding head Michael Hiscox looked to initiate fast and simple studies, by tweaking existing programs.22 Data collection can be the most expensive part of a randomised trial, so BETA tries not to run new surveys, but instead to use existing administrative records.

  In today’s ‘big data’ era, governments (and businesses) hold more information about us than ever before. From birthweight to exam results, welfare payments to tax returns, hospital admissions to criminal records, government databases are overflowing with personal information. In some countries, this information is linked together – indeed, Scandinavian governments have now ceased taking censuses, and rely solely on administrative data. Reasonably enough, citizens expect that their information will be kept private. But this should not stop agencies from using existing data to measure outcomes in randomised evaluations. If big data can help get information cheaply, randomised trials become a whole lot simpler.

  *

  Another source of simple randomised trials is lotteries. As we have already seen, randomistas have studied lotteries for desirable schools and for conscripts to fight in Vietnam. Economists have even looked at cash lotteries, estimating how an unexpected windfall changes people’s lives. The typical answer is ‘less than you might expect’. For example, a study of Dutch lottery winners found that a prize equivalent to two-thirds of a year’s income leads people to upgrade their cars and buy new household appliances.23 But six months after the lottery, winning households are no happier than their unlucky neighbours. In the United States, a study looked at whether lottery winners were more likely to send their children to university, and found an effect only with very large prizes.24 Similarly, in Sweden, lottery winners were found to reduce their working hours, but only by a small amount.25

  For much of the world’s population, the biggest possible lottery win would be to move to a richer country. About 6 billion of the world’s nearly 8 billion people live in developing countries, and surveys show that at least 2 billion of them would move to a developed nation if they could.26 Facing excess demand for migration places, some advanced countries have used actual lotteries to decide who gets a spot. The argument for visa lotteries is that they place all applicants on an equal footing, regardless of their inside knowledge, wealth or personal connections.

  As it happens, lotteries also provide a powerful way for researchers to estimate the impact of moving from one country to another.27 Because migrants are self-selected, comparing the outcomes of movers and stayers could badly skew the results. Arnold Schwarzenegger isn’t just another Austrian. Martina Navratilova isn’t just another Czech. Ang Lee isn’t just another Taiwanese. If we want to know how shifting countries changes a person’s life, then it’s a mistake to compare migrants with those in their home country. Instead, we need a situation in which similar people apply to move, and only chance determines who gets to migrate. Visa lotteries do just that.
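
  To see why the lottery matters, consider a toy simulation, in which every number is invented. Give each person an underlying drive that raises their income anywhere and also makes them likelier to apply to migrate, then let a coin flip decide which applicants get a visa. Comparing winners with everyone who stayed overstates the gain from migrating; comparing winners with unlucky applicants recovers the true effect.

```python
# Toy simulation of selection bias in migration studies (invented numbers).
import random

random.seed(1)
TRUE_EFFECT = 10_000   # assumed causal income gain from migrating

people = []
for _ in range(100_000):
    drive = random.gauss(0, 1)                 # raises income everywhere
    applies = drive + random.gauss(0, 1) > 1   # the driven apply more often
    wins = applies and random.random() < 0.5   # visa lottery among applicants
    income = 30_000 + 5_000 * drive + (TRUE_EFFECT if wins else 0)
    people.append((applies, wins, income))

def avg_income(rows):
    return sum(r[2] for r in rows) / len(rows)

winners = [p for p in people if p[1]]
stayers = [p for p in people if not p[1]]
losers  = [p for p in people if p[0] and not p[1]]

print(f"winners vs all stayers:    {avg_income(winners) - avg_income(stayers):,.0f}")  # overstated
print(f"winners vs lottery losers: {avg_income(winners) - avg_income(losers):,.0f}")   # near 10,000
```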

 
