The Teacher Wars

by Dana Goldstein

To address these problems, in 2012 Michelle Rhee’s successor, Kaya Henderson, accelerated the pace at which teachers in high-poverty schools could qualify for financial bonuses tied to student performance, hoping to make working in those schools more attractive. She also decreased the amount of teachers’ evaluation scores tied to value-added, from 50 to 35 percent in tested subjects and grades, and she added a new evaluation category to reward teachers for “commitment to school community.”

Those shifts to a more holistic system of teacher evaluation were overshadowed by a series of exposés, published by Jack Gillum and Marisol Bello of USA Today, demonstrating that during Rhee’s chancellorship, the test-maker CTB/McGraw-Hill flagged hundreds of D.C. classrooms for statistically improbable answer sheet erasure rates on state tests, possible evidence that adults had corrected students’ mistakes. The average child erases zero, one, or two answers on a multiple-choice test; typical answer sheets at one D.C. school, the Noyes Education Campus, contained between five and twelve erasures, depending on the classroom. The school’s principal, Wayne Ryan, resigned in disgrace, but only after collecting $20,000 in bonuses attached to test score increases.

Noyes was not an isolated case. Increasingly, there was evidence that a significant number of unscrupulous administrators and teachers nationwide had responded to the higher stakes attached to state-level standardized tests—evaluations, bonus pay, and public release of data—by cheating. The same USA Today team that revealed the D.C. irregularities studied six other states and found over sixteen hundred examples of probable test score manipulation between 2002 and 2010. (The newspaper would have almost certainly found even more cheating had it not zeroed in on only the most suspicious test score leaps: those that statisticians said were about as likely to be legitimate as a Powerball ticket was to be a winner. For example: At one Gainesville, Florida, elementary school, math proficiency rates jumped from 5 percent to 91 percent in three years.) A subsequent investigation by the Atlanta Journal-Constitution discovered 196 school districts across the country with suspicious test score gains.

Atlanta itself was the site of the nation’s most infamous recent cheating scandal. On March 29, 2013, thirty-five Atlanta teachers and administrators, including the city’s former superintendent, Beverly Hall, were indicted. The grand jury report revealed a shockingly sick culture of adult cheating, in which Hall, who had been the 2009 national “superintendent of the year,” fired whistle-blowers and protected the jobs of employees who purposefully sat struggling kids next to high-performing ones to encourage cheating on tests, and who gathered at afterschool “erasure parties” to correct multiple-choice answer sheets before submitting them to be graded. Teachers and principals in Atlanta could earn thousands of dollars in bonuses for raising scores; Hall’s bonuses totaled $580,000.

In the wake of this appalling ethical lapse, which resulted in thousands of Atlanta children—largely poor and black—being told they had acquired crucial academic skills they actually lacked, accountability reformers rushed to defend high-stakes testing policies. “The existence of cheating says nothing about the merits of testing,” Arne Duncan argued in the Washington Post. Bill Gates said that cheating represented just a “tiny” rounding error in the landscape of standardized testing. They all advocated blaming the adult cheaters while absolving the policies that provided incentives to cheat.

Even where no systemic cheating was alleged, there were disappointments with the new teacher evaluation schemes. When New York City released value-added data for individual teachers in 2012—and the Times and other news organizations made them searchable by teachers’ names—the margin of error was a staggering 53 points out of 100 for English teachers and 35 points out of 100 for math teachers. Numbers like that forced even strong supporters of data-driven accountability, including Bill Gates, Wendy Kopp, and Doug Harris, to speak out against the public release of such data. Kati Haycock began to worry that reformers, including many of her allies, had run “roughshod over those who were anxious about whether value-added was strong enough to support all of this … There are voices who said, ‘Do it anyway! This is the moment!’ Those people may still be right, but I count myself among a group of folks who are saying, as mad as I was about how slow we went before Race to the Top, I think I might be almost more upset now about the decision to go faster than these systems can handle.”

Chester Finn, the moderate Republican reformer and former assistant secretary of education in the Reagan administration, agrees. “We’ll probably discover ten years from now you can’t do truly quantitative achievement-based evaluation of teachers with any great reliability,” he told me. This is the typical hype-disillusionment cycle in American education reform, driven by moral panic about bad teaching.

Already there is some evidence that the new Race to the Top evaluation systems are failing to meaningfully distinguish between teachers, in much the same way that past evaluation systems failed. In Michigan and Tennessee in 2012, 98 percent of teachers were rated effective or better; in Florida, 95 percent; and in Georgia, 94 percent—numbers hardly different from those under the old systems.

It is unclear exactly why this is happening, but we can wager a few guesses. It could be that, as in the past, principals are not taking the time to thoroughly evaluate each teacher on the classroom observation components of these systems, either because of the large administrative burden this imposes—Florida’s observation system requires ratings in sixty categories for each teacher—or because they lack the training in how to do so.

Teachers union leaders have suggested the low ineffective rates prove that only a tiny fraction of teachers, after all, are bad at their jobs. Before you dismiss this response as self-serving, consider this: Even tough reformers like Colorado state senator Mike Johnston say they’d like to see only the bottom 5 to 10 percent of teachers fired each year. Economist Eric Hanushek has even written, “The majority of [American] teachers are effective. They are able to compete with teachers virtually anywhere else in the world.” If only a small minority of teachers are truly terrible, then evaluation systems that flag 2 to 6 percent of a state’s teachers as problematic, produce layoffs of 10 percent of teachers in D.C., and deny or defer 50 percent of teachers tenure in New York City represent a huge step forward toward a more accountable profession. In New Haven, a new union contract eliminated tenure protections for just the 2 percent of teachers declared “ineffective” annually. Superintendent Garth Harries, an accountability reformer, is satisfied. “I think the 2 percent represents a real and significant number of teachers,” he told me. “In the end, it’s not a huge number, but the fact that these teachers are, in fact, leaving for reasons directly related to performance has a fairly profound impact on the rest of the force. Folks saying, ‘Thank God!’ and folks saying, ‘They’re serious. I have to make sure I get my act together!’ If we’re truly going to have a professional construct for teaching, I don’t think there’s a set number of teachers we remove and then we’re done. I don’t think I’d want it to be below 2 percent [annually]. But I’d be perfectly happy with 2 percent in perpetuity.”

Jonah Rockoff, a coauthor of the landmark value-added study linking test score growth to later income, says that because of concerns over teaching to the test, the next frontier for research will be to measure a teacher’s impact in new ways. That could be done by looking at how teachers influence student behavior, attendance, or GPA. “We all know test scores are limited not just in their power and accuracy, but in the scope of what we want teachers and schools to be teaching our kids,” Rockoff said. “If we had a more holistic view of teaching, that would be great. But I don’t mean touchy-feely, ‘you can teach however you want.’ It’s the idea that there’s not just one thing we care about our kids learning. We’re going to measure how kids do on socio-cognitive outcomes and reward teachers on that, too.”

But as Arne Duncan has acknowledged, states can’t simply use value-added to “fire their way to the top.” Even if test scores were a flawless reflection of student learning and teacher quality, there is no evidence that the new teachers who replace the bad teachers will be any better—it is practically impossible to predict, via demographic traits, test scores, grades, or pathway into the profession, who will become an effective teacher.

Research and experience demonstrate that it makes good sense to tie teacher tenure and job security more closely to performance, and less to seniority. The contract provisions of the 1960s and 1970s make less sense now that we know so much more about how teachers’ mind-sets and practices impact children’s learning. But the history of American public education shows that teachers are uniquely vulnerable to political pressures and moral panics that have nothing to do with the quality of their work. Even Michelle Rhee says she believes in due process, as long as the process of grieving a termination is conducted quickly. “I’d seen too many examples of good teachers who had been railroaded by ineffective administrators,” she wrote in her memoir. “Those teachers had to have a structure through which they could appeal evaluations when appropriate.”

If the key to systemwide improvement is not through mass firings or union-busting, then what remains is to turn the existing average teacher into an expert practitioner, what Rockoff calls “moving the big middle” of the teaching profession. That effort will require a lot more than data—it will require a shared vision of what excellent teaching looks like, and the mentorship and training to get teachers there.

* * *

*1 Note how different this top-down theory of change is from that of the community control movement, in which parents at the grassroots level were conceived of as the vanguard of school reform.

*2 In Visible Learning, John Hattie notes that while researchers generally have trouble locating the effects of teachers’ content knowledge on student outcomes, there is other evidence suggesting that teachers’ general intellectual ability, particularly vocabulary and verbal facility, are positively associated with student achievement gains. These skills, however, may have very little to do with the competitiveness of a teacher’s college or graduate school, or the content of the classes he or she took there.

*3 In 2007 TFA sent 13 percent of corps members to charter schools. In 2013, as recession budget cuts slowed district hiring, one-third of corps members were hired by charter schools and about half of alumni still teaching were working in charters. Not all charters are “no excuses” schools. Some, like Global Community in Harlem and Community Roots in Brooklyn, emphasize project-based learning and other progressive pedagogies.

*4 Johnston’s enthusiasm for test-score-based accountability was a sign of changing times. Less than a decade earlier he had published a poignant memoir about his time as a TFA corps member in the Mississippi Delta, in which he complained about “innumerable state testing sessions” and “the furor to try to improve test scores.”

*5 School closings have emerged as one of the most controversial issues in education reform. Closings are sold as a way to get kids into better schools. But according to the Consortium on Chicago School Research, only 6 percent of Chicago students whose schools were shut down ended up enrolled in a school within the top achievement quartile, and 40 percent of students from closed schools ended up at schools that were on academic probation.

*6 Districts and charters pay TFA $2,000 to $5,000 per corps member, which helps cover the costs of the summer institute and the support TFA provides to its teachers during the school year.

*7 The foundation run by the Walton family, the descendants of the Walmart founders, is a key TFA funder, and has also contributed to the National Right to Work Legal Defense Foundation, an anti-union group.

• Chapter Ten •

“Let Me Use What I Know”

REFORMING EDUCATION BY EMPOWERING TEACHERS

To many American teachers, the last decade of value-added school reform has felt like something imposed on them from outside and from above—by politicians with little expertise in teaching and learning, by corporate philanthropists who long to remake education in the mold of the business world, and by economists who see teaching as less of an art than a science. According to a 2013 poll conducted by Scholastic and the Gates Foundation, the majority of American teachers feel alienated from education policy making, with only a third reporting that their opinions are valued at the district level, 5 percent reporting they are valued at the state level, and just 2 percent reporting they are valued at the national level. Those frustrations have begun to break into the public debate. Dissident teachers and their unions are winning support from parent activists who are protesting the increased number of standardized tests, the time spent on test prep, and the lack of instructional time for projects, field trips, art, and music. Testing is a part of any functional education system, but in recent years it has often seemed like the horse of school improvement has been driven by the cart of collecting student data to be used in teacher evaluation. Meanwhile, more and more accountability reformers acknowledge that new teacher evaluation systems are not a panacea. They identify only a small number of teachers as ineffective, and do nothing, on their own, to guarantee that teachers’ skills will actually improve over time. The hope that collecting more test scores will raise student achievement is like the hope that buying a scale will result in losing weight. We now have a lot of numbers to back up our inkling that something is wrong. But if we don’t start improving instruction in the classroom, those numbers simply will not change.

“No excuses” strategies are not the only promising avenue for instructional reform. In the long term, reform programs that combine high-stakes standardized tests with scripted lesson plans and a limited arsenal of pedagogical strategies may make teaching a less attractive job for exactly the sort of ambitious, creative, high-achieving people we most want to attract. Polls of teachers who leave the profession show many did so because they received no constructive feedback on their practice, they had too little time to think creatively and collaborate with colleagues, and they had no opportunity to take on additional responsibilities and grow as professionals. So the next step in American education reform may be to focus less on top-down efforts to ferret out the worst teachers or turn them into automatons, and more on classroom-up interventions that replicate the practices of the best. Today reformers across the country are experimenting with empowering teachers to coach their peers, to remake teacher education, to design creative curriculum materials, and to lead school turnaround efforts. These practices conceive of veteran teachers as assets, not liabilities. As history has taught us, that is a pragmatic stance crucial to sustaining any reform program, which teachers must carry out on the ground.

Race to the Top focused attention on the teacher evaluation process, particularly on how student test scores are used to judge teachers. But in every state, a large, if not dominant, part of a teacher’s evaluation score is still tied to classroom observations.

Observation is a challenging endeavor, in large part because it can be so subjective. Remember William Maxwell? He was the superintendent in turn-of-the-century New York City who complained that 99.5 percent of teachers were being evaluated as “good.” He created a complex new A–D system based on principal observations and ratings, in which, it turned out, the vast majority of principals rushed through the motions and gave all their teachers a B+. For over a century, classroom observation has failed to successfully differentiate between teachers. So how can that change? How can observation capture what everyone knows—that some teachers are better than others—and what everyone doesn’t yet know: What exactly makes them that way?

The importance of looking beyond value-added measurement to carefully watch how teachers work with children is underscored by new research on what actually occurs in many classrooms, especially those populated by low-income students. In 2009 economist Thomas Kane and the Bill and Melinda Gates Foundation began a massive study on teacher effectiveness, known as the MET (Measures of Effective Teaching) project. MET collected videos of 1,333 teachers at work and gave them to highly trained evaluators to analyze. The experts found that only a third of the classrooms showed evidence of teachers promoting intellectual growth beyond rote learning.

That aligns with past research. A 2011 observation of elementary school classrooms in Baltimore showed that the majority of teachers failed to use challenging vocabulary words, failed to ask questions that probed for conceptual understanding (as opposed to simply correct answers), and rarely led their classes in whole-group discussions. In the weeks before state standardized tests, the Baltimore teachers engaged in those desirable activities even less frequently than usual and also decreased their personal interactions with students, who were “spending a good deal of their time on paper and pencil skill-based worksheets that did not require critical thinking or collaboration,” the researchers reported. A 2009 review of the research literature on teacher practices, including several studies of thousands of elementary school classrooms across the country, found that low-income children are likely to spend their school days drilling in low-level skills, like spelling, and watching teachers deal with poorly behaved students.

Maybe none of that matters, if multiple-choice worksheets help children learn. But research shows that when teachers promote more interactions among students and focus their lessons on concepts that are broader and more challenging than those represented on multiple-choice tests, children’s scores on higher-level assessments—like those that require writing—actually go up. Rigorous, interactive classrooms promote higher student achievement.
