by Paul Tough
But as every middle-school teacher knows, convincing students of that logic is a lot harder than it seems. Motivation, it turns out, is quite complex, and rewards sometimes backfire. In their book Freakonomics, Steven Levitt and Stephen Dubner recount the story of a study researchers undertook in the 1970s to see if giving blood donors a small financial stipend might increase blood donations. The result was actually that fewer people gave blood, not more.
And while the M&M test suggests that giving kids material incentives to succeed should make a big difference, in practice, it often doesn’t work that way. In recent years, the Harvard economist Roland Fryer has essentially tried to extend the M&M experiment to the scale of a metropolitan school system. He tested several different incentive programs in public schools—offering bonuses to teachers if they improved their classes’ test results; offering incentives like cell-phone minutes to students if they improved their own test results; offering families financial incentives if their children did better. The experiments were painstaking and carefully run—and the results have been almost uniformly disappointing. There are a couple of bright spots in the data—in Dallas, a program that paid young kids for each book they read seems to have contributed to better reading scores for English-speaking students. But for the most part, the programs were a bust. The biggest experiment, which offered incentives to teachers in New York City, cost seventy-five million dollars and took three years to conduct. And in the spring of 2011, Fryer reported that it had produced no positive results at all.
7. The Coding-Speed Test
This is the problem with trying to motivate people: No one really knows how to do it well. It is precisely why we have such a booming industry in inspirational posters and self-help books and motivational speakers: what motivates us is often hard to explain and hard to measure.
Part of the complexity is that different personality types respond to different motivations. We know this because of a series of experiments undertaken in 2006 by Carmit Segal, then a postdoctoral student in the Harvard economics department and now a professor at a university in Zurich. Segal wanted to test how personality and incentives interacted, and she chose as her vehicle one of the easiest tests imaginable, an evaluation of basic clerical skills called the coding-speed test. It is a very straightforward test. First, participants are given an answer key in which a variety of simple words are each assigned a four-digit identifying number. The list looks something like this:
game 2715
chin 3231
house 4232
hat 4568
room 2864
And then a little lower on the page is a multiple-choice test that offers five four-digit numbers as the potential correct answer for each word.
Questions      Answers
               A      B      C      D      E
1. hat         2715   4232   4568   3231   2864
2. house       4232   2715   4568   3231   2864
3. chin        4232   2715   3231   4568   2864
All you have to do is find the right number from the key above and then check that box (1C, 2A, 3C, etc.). It’s a snap, if a somewhat mind-numbing one.
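To make the mechanics concrete, here is a minimal sketch of the lookup task in Python. It uses only the sample key and the three sample questions shown above; the function and variable names are illustrative assumptions, not part of Segal's materials.

```python
# Answer key from the sample above: each word is assigned a four-digit code.
ANSWER_KEY = {
    "game": "2715",
    "chin": "3231",
    "house": "4232",
    "hat": "4568",
    "room": "2864",
}

# Each question pairs a word with five candidate codes, labeled A through E.
QUESTIONS = [
    ("hat", ["2715", "4232", "4568", "3231", "2864"]),
    ("house", ["4232", "2715", "4568", "3231", "2864"]),
    ("chin", ["4232", "2715", "3231", "4568", "2864"]),
]


def correct_choice(word, options, key=ANSWER_KEY):
    """Return the letter (A-E) of the option that matches the word's code."""
    return "ABCDE"[options.index(key[word])]


def solve(questions=QUESTIONS):
    """Return the answers in the notation used in the text, e.g. '1C'."""
    return [
        f"{number}{correct_choice(word, options)}"
        for number, (word, options) in enumerate(questions, start=1)
    ]


print(solve())  # ['1C', '2A', '3C']
```

Nothing here requires reasoning beyond a dictionary lookup, which is the point: on a task this simple, a score reflects mostly speed and sustained effort.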
Segal located two large pools of data that included scores from thousands of young people on both the coding-speed test and a standard cognitive-skills test. One pool was the National Longitudinal Survey of Youth, or NLSY, a huge survey that began tracking a cohort of more than twelve thousand young people in 1979. The other was a group of military recruits who took the coding exam as part of a range of tests they had to pass in order to be accepted into the U.S. Armed Forces. The high-school and college students who were part of the NLSY had no real incentive to exert themselves on the tests—the scores were for research purposes only and didn’t have any bearing on their academic records. For the recruits, though, the tests mattered very much; bad scores could keep them out of the military.
When Segal compared the scores of the two groups on each test, she found that on average, the high-school and college kids did better than the recruits on the cognitive tests. But on the coding-speed test, it was the recruits who did better. Now, that might have been because the kind of young person who chose to enlist in the armed forces was naturally gifted at matching numbers with words, but that didn’t seem too likely. What the coding-speed test really measured, Segal realized, was something more fundamental than clerical skill: the test takers’ inclination and ability to force themselves to care about the world’s most boring test. The recruits, who had more at stake, put more effort into the coding test than the NLSY kids did, and on such a simple test, that extra level of exertion was enough for them to beat out their more-educated peers.
Now, remember that the NLSY wasn’t just a one-shot test; it tracked young people’s progress afterward for many years. So next Segal went back to the NLSY data, looked at each student’s cognitive-skills score and coding-speed score in 1979, and then compared those two scores with the student’s earnings two decades later, when the student was about forty. Predictably, the kids who did better on the cognitive-skills tests were making more money. But so were the kids who did better on the super-simple coding test. In fact, when Segal looked only at NLSY participants who didn’t graduate from college, their coding-test scores were every bit as reliable a predictor of their adult wages as their cognitive-test scores. The high scorers on the coding test were earning thousands of dollars a year more than the low scorers.
And why? Does the modern American labor market really put such a high value on being able to compare mindless lists of words and numbers? Of course not. And in fact, Segal didn’t believe that the students who did better on the coding test actually had better coding skills than the other students. They did better for a simple reason: they tried harder. And what the labor market does value is the kind of internal motivation required to try hard on a test even when there is no external reward for doing well. Without anyone realizing it, the coding test was measuring a critical noncognitive skill that mattered a lot in the grown-up world.
Segal’s findings give us a new way of thinking about the so-called low-IQ kids who took part in the M&M experiment in south Florida. Remember, they scored poorly on the first IQ test and then did much better on the second test, the one with the M&M incentive. So the question was: What was the real IQ of an average “low-IQ” student? Was it 79 or 97? Well, you could certainly make the case that his or her true IQ must be 97. You’re supposed to try hard on IQ tests, and when the low-IQ kids had the M&M’s to motivate them, they tried hard. It’s not as if the M&M’s magically gave them the intelligence to figure out the answers; they must have already possessed it. So in fact, they weren’t low-IQ at all. Their IQs were about average.
But what Segal’s experiment suggests is that it was actually their first score, the 79, that was more relevant to their future prospects. That was their equivalent of the coding-test score, the low-stakes, low-reward test that predicts how well someone is going to do in life. They may not have been low in IQ, but they were low in whatever quality it is that makes a person try hard on an IQ test without any obvious incentive. And what Segal’s research shows is that that is a very valuable quality to possess.
8. Conscientiousness
So what do you call the quality exhibited by Segal’s go-getters, the kids who exerted themselves whether or not there was a potential reward? Well, here’s the technical term that personality psychologists use: conscientiousness. Over the past couple of decades, a consensus has emerged among personality psychologists that the most effective way to analyze the human personality is to consider it along five dimensions, known as the Big Five: agreeableness, extraversion, neuroticism, openness to experience, and conscientiousness. And when Segal gave the male students in one of her surveys a standard personality test, the ones who didn’t respond to material incentives—who did well whether or not there were M&M’s involved—scored particularly high on conscientiousness.
Within the world of personality psychology, the reigning expert on conscientiousness is Brent Roberts, a professor at the University of Illinois at Urbana-Champaign who has collaborated with both James Heckman, the economist, and Angela Duckworth, the psychologist. Roberts told me that in the late 1990s, when he was getting out of grad school and deciding what research field to specialize in, no one wanted to study conscientiousness. Most psychologists considered it to be the black sheep of the personality field. Many still do. It’s a cultural thing, Roberts explained. Like the word character, the word conscientiousness has some strong and not always positive associations outside academia. “Researchers prefer to study the things they value,” he told me. “And the people in society who value conscientiousness are not intellectuals, and they’re not academics, and they’re not liberals. They tend to be religious-right conservatives who think people should be more controlled.” (According to Roberts, psychologists prefer to study openness to experience. “Openness is just cool,” he explained, a little ruefully. “It’s about creativity. Plus, it has the strongest correlation with liberal ideology. Most of us in personality psychology, including me, I should say, are liberal. And we like studying ourselves.”)
Though academic personality psychologists, with the lonely exception of Roberts, mostly stayed away until recently, Big Five conscientiousness was embraced in the 1990s by a less illustrious psychological specialty: industrial/organizational psychology, or I/O psychology. Researchers in that field rarely hold positions at prestigious universities; most of them work as consultants for human-resource managers in large corporations that have a very specific need, far removed from esoteric academic debates: they want to hire the most productive, reliable, and diligent workers they can find. When I/O psychologists began using various personality assessments to help corporations identify those workers, they found consistently that Big Five conscientiousness was the trait that best predicted workplace success.
What intrigues Roberts most about conscientiousness is that it predicts so many outcomes that go far beyond the workplace. People high in conscientiousness get better grades in high school and college; they commit fewer crimes; and they stay married longer. They live longer—and not just because they smoke and drink less. They have fewer strokes, lower blood pressure, and a lower incidence of Alzheimer’s disease. “It would actually be nice if there were some negative things that went along with conscientiousness,” Roberts told me. “But at this point it’s emerging as one of the primary dimensions of successful functioning across the lifespan. It really goes cradle to grave in terms of how well people do.”
9. The Downside of Self-Control
Of course, that doesn’t mean that everyone agrees that conscientiousness is an entirely positive thing. In fact, some of the first empirical evidence for the connections between conscientiousness and success in school and the workplace came from people who didn’t think much of either school or the workplace. In their 1976 book Schooling in Capitalist America, the Marxist economists Samuel Bowles and Herbert Gintis argued that American public schools had been set up to perpetuate social-class divisions. In order for capitalists to keep proletarians in their class of origin, they wrote, “the educational system must try to teach people to be properly subordinate.” Bowles and Gintis drew on contemporary research by Gene Smith, a psychologist who had found that the test that most reliably predicted a high-school student’s future didn’t measure IQ; it measured how a student’s peers rated him on a trait Smith called “strength of character,” which included being “conscientious, responsible, insistently orderly, not prone to daydreaming, determined, persevering.” This measure was three times more successful in predicting college performance than any combination of cognitive ratings, including SAT scores and class rank. Intrigued by Smith’s results, Bowles and Gintis and a colleague undertook a new research project, subjecting all 237 students in the senior class of a big high school in New York state to a variety of IQ and personality tests. They found, as expected, that cognitive scores were quite predictive of GPA, but an index they derived from a combination of sixteen personality measures, including conscientiousness, had an equivalent predictive power.
To psychologists like Seligman and Peterson and Duckworth and Roberts, these results are a resounding demonstration of the importance of character to school success. To Bowles and Gintis, they were evidence that the school system was rigged to create a docile proletariat. Teachers rewarded repressed drones, according to Bowles and Gintis; they found that the students with the highest GPAs were the ones who scored the lowest on measures of creativity and independence, and the highest on measures of punctuality, delay of gratification, predictability, and dependability. Bowles and Gintis then consulted similar scales for office workers, and they found that supervisors judged their workforce the way teachers judged their students. They gave low ratings to employees with high levels of creativity and independence and high ratings to those workers with high levels of tact, punctuality, dependability, and delay of gratification. To Bowles and Gintis, these findings confirmed their thesis: Corporate America’s rulers wanted to staff their offices with bland and reliable sheep, so they created a school system that selected for those traits.
According to Roberts’s research, people who score high on conscientiousness tend to share certain characteristics: they are orderly, hard-working, reliable, and respectful of social norms. But perhaps the most important ingredient of conscientiousness is self-control. And when it comes to self-control, Marxist economists are not the only people who are skeptical of its value.
In Character Strengths and Virtues, Peterson and Seligman contended that “there is no true disadvantage of having too much self-control”; it is a capacity, like strength or beauty or intelligence, with no inherent downside—the more you have, the better. But an opposing school of thought, led by the late Jack Block, a psychological researcher at the University of California at Berkeley, argued that too much self-control could be just as big a problem as too little. Overcontrolled people are “excessively constrained,” Block and two colleagues wrote in one paper. They “have difficulty making decisions [and] may unnecessarily delay gratification or deny themselves pleasure.” According to these researchers, conscientious people are classic squares: they’re compulsive, anxious, and repressed.
Block’s findings are certainly valid; it’s easy to see how conscientiousness can descend into compulsiveness. But at the same time, it is hard to argue with the data showing correlations between self-control and positive outcomes. In 2011, that pool of evidence grew further when a team of researchers published the results of a three-decade-long study of more than a thousand young people in New Zealand that showed, in new detail, clear connections between childhood self-control and adult outcomes. When their subjects were between the ages of three and eleven, the researchers, led by the psychologists Avshalom Caspi and Terrie Moffitt and including Brent Roberts, used a variety of tests and questionnaires to measure the children’s self-control and then combined those results into a single self-control rating for each child. When they surveyed the subjects at age thirty-two, they found that the childhood self-control measure had predicted a wide array of outcomes. The lower a subject’s self-control in childhood, the more likely he or she was at thirty-two to smoke, to have health problems, to have a bad credit rating, and to have been in trouble with the law. In some cases, the effect sizes were huge: Adults with the lowest self-control scores in childhood were three times more likely to have been convicted of a crime than those who scored highest as kids. They were three times more likely to have multiple addictions, and they were more than twice as likely to be raising their children in a single-parent household.
10. Grit
But even Angela Duckworth agrees that self-control has its limitations. It may be very useful for predicting who will graduate from high school, but, she says, it’s not as relevant when it comes to identifying who might invent a new technology or direct an award-winning movie. And after publishing her groundbreaking self-control-versus-IQ study in Psychological Science in 2005, Duckworth began to sense that self-control wasn’t precisely the driver of success that she was looking for. She considered her own career. She was, by objective measures, very intelligent, and she recognized that she had high levels of self-discipline: she got up early; she worked hard; she met deadlines; she made it to the gym on a regular basis. And though she was certainly successful (very few doctoral students have their first-year theses published in a prestigious journal like Psychological Science), her peripatetic early career was much less directed than that of, say, David Levin, who had found his life’s calling at twenty-two and had persisted at the same goal ever since, overcoming many obstacles and creating, with Michael Feinberg, a successful network of charter schools educating thousands of students. Duckworth felt that Levin, who was about her age, possessed some trait that she did not: a passionate commitment to a single mission and an unswerving dedication to achieve that mission. She decided she needed to name this quality, and she chose the word grit.
Working with Chris Peterson, Seligman’s coauthor on Character Strengths and Virtues, Duckworth developed a test to measure grit, which she called the Grit Scale. It is a deceptively simple test, just twelve brief statements on which respondents must evaluate themselves, including “New ideas and projects sometimes distract me from previous ones”; “Setbacks don’t discourage me”; “I am a hard worker”; and “I finish whatever I begin.”
For each statement, respondents score themselves on a five-point scale, ranging from 5, “very much like me,” to 1, “not like me at all.” The test takes about three minutes to complete, and it relies entirely on self-report—and yet when Duckworth and Peterson took it out into the field, they found it was remarkably predictive of success. Grit, Duckworth discovered, is only faintly related to IQ—there are smart gritty people and dumb gritty people—but at Penn, high grit scores allowed students who had entered college with relatively low college-board scores to nonetheless achieve high GPAs. At the National Spelling Bee, Duckworth found that children with high grit scores were more likely to survive to the later rounds. Most remarkable, Duckworth and Peterson gave their grit test to more than twelve hundred freshman cadets as they entered the military academy at West Point and embarked on the grueling summer training course known as Beast Barracks. The military has developed its own complex evaluation, called the whole candidate score, to judge incoming cadets and predict which of them will survive the demands of West Point; it includes academic grades, a gauge of physical fitness, and a leadership potential score. But the more accurate predictor of which cadets persisted in Beast Barracks and which ones dropped out turned out to be Duckworth’s simple little twelve-item grit questionnaire.
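For readers curious how a self-report scale like this turns into a single number, here is a rough sketch in Python. The text gives only the five-point range and a few sample statements, so two things below are assumptions rather than Duckworth's documented procedure: that a respondent's score is the plain average of the item ratings, and that negatively worded statements (such as “New ideas and projects sometimes distract me from previous ones”) are flipped before averaging. The function name and the choice of which items to flip are hypothetical.

```python
from statistics import mean

# Five-point range described in the text: 5 = "very much like me",
# 1 = "not like me at all".
SCALE_MIN, SCALE_MAX = 1, 5


def grit_score(responses, reverse_items=frozenset()):
    """Average the item ratings, flipping any reverse-scored items.

    responses: list of integer ratings from 1 to 5, one per statement.
    reverse_items: zero-based positions of negatively worded statements.
        Which items are reversed is an assumption made for illustration.
    """
    if any(not SCALE_MIN <= r <= SCALE_MAX for r in responses):
        raise ValueError("each rating must be between 1 and 5")
    adjusted = [
        (SCALE_MIN + SCALE_MAX - r) if i in reverse_items else r
        for i, r in enumerate(responses)
    ]
    return mean(adjusted)


# Ratings for the four statements quoted above, in order. The first
# statement is treated as reverse-scored (an assumption).
print(grit_score([2, 4, 5, 4], reverse_items={0}))  # 4.25
```

With the first item flipped, a rating of 2 counts as a 4, so the four ratings average to 4.25; without the flip, the same ratings would average 3.75.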