This Is the Voice

by John Colapinto


  In healthy democratic societies, a free press serves as the best check on demagoguery, but no more. Trump had first joined Twitter in March 2009—and he quickly understood that it had replaced old media as the prime tool for mass influence and persuasion (“It’s like owning your own newspaper—without the losses,” he tweeted in November 2012). As a candidate, Trump used the platform as an eerily exact extension of his corporeal voice: his tweets—often typo-riddled, in all caps, and with multiple exclamation points—were issued like the vocal blows he used to pummel opponents into silence. “Hillary will never reform Wall Street. She is owned by Wall Street!”64 “#WheresHillary? Sleeping!!!!!”65 “AMERICA FIRST!”66 Meanwhile, he exhorted his followers to ignore what the mainstream press was saying about him. They were “fake news” and (in a phrase borrowed from Goebbels, who’d used it to demonize the Jews) “the enemy of the people.”

  The frustration of old media entities at their inability to act as a check on Trump was palpable. In November 2017, ten months into Trump’s presidency, MSNBC host Lawrence O’Donnell fumed over how Trump and his surrogates could continue to lie and contradict themselves without facing any consequences. In this particular instance, O’Donnell was incensed by comments from Bannon, who had performed a whiplash-inducing 180-degree turn on a statement from just hours before. People must be “fools,” O’Donnell sputtered, not to see that Bannon, like Trump, is a “hustler” and “liar.”

  “But he’s not exactly a liar,” said O’Donnell’s guest, pundit David Frum, who understood perfectly how Trump and those around him used their voices. “Even for a liar, words have meaning,” Frum said. “Think of it as music. He’s a musician. These are tones of grievance and anger. And it is an important thing to understand because we’re going to hear more of this in the coming months and years.”67

  The best explanation of how that music of “grievance and anger” won the support of over sixty million Americans is found in a book written before Trump even announced his candidacy (although published afterward, in 2016): Hillbilly Elegy, a memoir by J. D. Vance of growing up in the Rust Belt city of Middletown, Ohio, in the 1990s and early 2000s. A once thriving steel town, Middletown had, like so many manufacturing cities, been destroyed by the offshoring of its factories and the subsequent widespread unemployment that led, first, to creeping despair, then to alcohol and opioid addiction, and finally to the loss of all hope.

  Middletown is a working-class white population suspicious, Vance wrote, of “outsiders or people who are different from us, whether the differences lie in how they look, how they act, or, most important, how they talk” (my italics).68 This hypersensitivity to the vocal channel, and its in-group/out-group signaling (which would have surprised Labov not at all), extended to the person who occupied the White House during the time that Vance was writing: Barack Obama, about whom Middletownians felt unease and even dislike—emotions born not of racism, Vance insists, but of the understandable envy they felt for Obama’s success, education, and healthy, happy family. Nothing made Obama seem more “alien” to the despairing citizens of Middletown, Vance wrote, than his voice. A thing of inspiration to those Americans fortunate enough to share Obama’s prosperity and prospects, that voice was, to people in Middletown, his most off-putting characteristic, precisely because it embodied everything admirable in the man, everything they had lost or never had. “His accent—clean, perfect, neutral—is foreign,” Vance wrote.69

  It is impossible to read these passages about Obama, and about the depths to which Middletownians had sunk, and still wonder why they, and millions of other Americans across the Rust Belt and other failing parts of America, were galvanized by the sound of Donald Trump’s venomous, vengeful voice.

  Since Trump’s victory, America has learned some harsh lessons about what happens when a leader seething with prejudice and anger uses the mechanisms of democracy to seize the national microphone. His voice amplifies voices like his own. On the very day of Trump’s electoral triumph, former grand wizard of the KKK David Duke, who had not been heard from in the mainstream media in over thirty years, was suddenly once again all over television screens, voicing his delight at Trump’s election—and tweeting to Trump: “We did it!” Mediagenic white supremacist Richard Spencer was extensively interviewed on all the major television networks and profiled in magazines and newspapers. On August 11, 2017, seven months into Trump’s presidency, the world was treated to the scarcely believable sight of hundreds of torch-bearing American neo-Nazis who had converged on the city of Charlottesville, Virginia, to stage a “Unite the Right” rally. Marching through the city, faces brazenly exposed to cameras as if they no longer had anything to fear in the way of retribution from employers or anyone else, they chanted “Jews will not replace us” and “Blood and Soil”—phrases drawn from de Sales’s English translation of Hitler’s speeches.

  * * *

  Given how faithful an index the human voice is to the thoughts, disposition, personality, intelligence, and judgment of a human being, there is every reason to feel fearful when someone like Donald Trump becomes the most powerful person in the world. The very sound of his voice—apart even from the ugliness of his utterances—announces his low impulse control and roiling anger, his tendency to reach for rash and oversimplified solutions to complex problems, his animal need to dominate—to “be a killer” (as his father put it to him in childhood).70 One hesitates to imagine how such a man might react if called upon to exercise the nuanced decision making and restraint demonstrated by John F. Kennedy during the Cuban Missile Crisis, qualities of mind and temperament uniquely demanded of the leaders of superpowers in an age of thermonuclear weaponry.

  Nor is nuclear Armageddon the only nightmare scenario that this perilous moment calls to mind. As I write, the leaders of every country in the world are grappling with how to stem a pandemic, the coronavirus—a crisis that demands global cooperation, coolheadedness, strategic reasoning, and the ability for decision makers to look beyond their own personal advancement. The president of the United States initially dismissed fears about the virus as a “hoax” perpetrated by a liberal mainstream media, working in collusion with a Democratic House of Representatives, to sabotage his reelection chances by cratering the stock market and depressing the economy. Precious weeks passed in inaction as the virus took hold across the United States, another sobering reminder of the dangers that the ancient Greeks warned of in democracy—a system where a person, no matter how unfit, can speak his way to power and, through misuse of that power, threaten to bring our species, and every other, to extinction.

  EIGHT SWAN SONG

  A vision of our annihilation in a nuclear conflagration or unchecked pandemic is too grim a note upon which to end this tribute to the gifts that we give, and are given, through the voice. Arguably, the greatest of those gifts is one we’ve only glanced at thus far: song, which I have addressed primarily in connection with Darwin’s insight that language began with singing. For some, however, the musical origins of speech are precisely why singing is a nonessential subject of inquiry into what made us human. Steven Pinker famously states that the melody and rhythm that early humans recruited for language render music a topic of secondary interest, at best, since these prosodic features provided nothing like the decisive evolutionary advantages that words and grammar did. We respond so powerfully to music, Pinker says, simply because it exaggerates the musical channel of speech, widening the swoops in melody, accentuating the beats and rhythms of talking, and thereby pushing harder on the buttons in the brain’s pleasure centers that early humans recruited for deciphering language. Thus, Pinker’s description of music as “auditory cheesecake”1: delicious for its artery-clogging overabundance of fats and sugar (or melody and rhythm), but hardly essential to life. Pinker concludes that “music could vanish from our species and the rest of our lifestyle would be virtually unchanged.”2

  For anyone (like myself) who draws such pleasure and solace from both making and listening to music, it is hard to credit this view, but to be fair to Pinker, it is also hard to explain the peculiar power and importance to our existence of music and song. President Obama inadvertently hinted at it, in June 2015, when he led a memorial service for the nine African Americans slaughtered at a church in Charleston, South Carolina, by a Glock-wielding white supremacist. A minute or two into his spoken remarks, Obama paused for an extraordinary twelve seconds. When he broke the silence, it was with the opening note of “Amazing Grace.” That one of our most eloquent orators found the medium of speech inadequate to that moment of shared anguish was telling. Whatever else Obama’s rendition of “Amazing Grace” was, it was not auditory cheesecake. It was far more nourishing (emotionally, psychologically, and, in the parlance of faith, spiritually), and it proved the mysterious power, and fundamental necessity, of singing when words alone fail us.

  * * *

  The first formal inquiries into the power of song were made in the 1920s by researchers using a then-revolutionary technology for capturing the voice and rendering it visually: phonophotography, which translated the vocal sound wave into a light beam focused onto a moving spool of motion picture film. This method captured, with surprising exactitude, the voice’s tiniest variations in pitch, timing, and volume as it delineated a melody, turning the signal into a wiggling horizontal line much like readings from our modern-day oscilloscopes. The technology had been used (by the aforementioned Gladys Lynch)I to study emotions in everyday speech, but Carl E. Seashore, a psychology professor at the University of Iowa, was the first to use it to analyze how the voice moves us so deeply when raised in song. Seashore and his colleague Milton Metfessel chose to study a style of singing renowned for its strong emotionality, but also considered untranscribable by standard musical notation: “negro folk singing,” as they called it—that is, the work songs and spirituals sung by African American populations in Georgia, Tennessee, and North Carolina. Derived from the songs that slaves had created in the eighteenth and nineteenth centuries, this singing would go on to form the basis for all jazz, blues, and rock singing, including that of white artists like Elvis Presley, the Beatles, and Mick Jagger, to name only a few of the most famous.

  The resulting study, Phonophotography in Folk Music: American Negro Songs in New Notation (1928), is a rare and remarkable document. In it, Seashore and Metfessel analyze the singing voices of African American farmhands and domestic workers, and they isolate a number of specific acoustic features that give the music its emotional power: including the bends in pitch that create the impression of sadness in blues singing, and something they call the falsetto-twist, which “resembles the voice breaking under emotion, especially grief.”3 But their primary discovery is how often voices that, to the naked ear, sound like they are perfectly on pitch, and strictly in rhythm, deviate widely from these norms, sliding well below or above the “correct” pitch, delaying an attack on a note for several milliseconds after the beat, or pouncing on a tone early, depending on the desired emotional effect. When Seashore and Metfessel, for comparison, analyzed the voices of trained opera singers, they professed themselves “stunned” to discover that the most highly trained classical singers do the same thing as those folk singers: edging on or off a pitch, coming in slightly before the beat or slightly after. The authors concluded that such carefully modulated imprecision is a feature of all emotionally affecting vocal music—and, indeed, of all great art. The minuscule deviations from a mechanically “perfect” rendering of a song were akin to the expressive imprecision that the Renaissance Italians extolled as sprezzatura in oil painting and drawing, or that Chinese ink-and-brush painters prize in calligraphy and other painting, or Jack Kerouac promoted in the “automatic,” off-the-top-of-his-head writing of On the Road—a freedom and brio that openly flouts perfection, but that gives vibrancy and emotional life. “The principle involved is a well-recognized theory of art,” Seashore and Metfessel wrote.

  This explains in part why so many of the current chart-topping songs in every genre (pop, country and western, hip-hop, rock), for all the virtuosity of their sometimes highly ornamented singing (glissandos, melisma, glass-shatteringly high pitches), sound cold, sterile, emotionally null: virtually all music producers now use Pro Tools software to “correct” the expressive slides on and off precise pitch. Many producers use the pitch-correction software to center the voice directly onto a note’s mathematically determined frequency, and adjust the timing of every word so that the syllables fall, with metronomic exactitude, precisely on the beat. Robert Warsh, a singer friend of mine with an especially expressive and soulful voice, told me why the widespread use of Pro Tools has been so disastrous for so many genres of music. “It removes the human moment when the voice travels to the note it’s seeking. I believe the listener registers, unconsciously, this natural ‘pitch-finding’—and it’s one of the things that attracts us to a voice. Compare a Sam Cooke song to one from Taylor Swift, and this will say it all.”4
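
  The two corrections described here, centering a sung pitch on a note’s exact frequency and moving a syllable onto the beat, can be illustrated with a little arithmetic. What follows is only a rough sketch and not a description of how Pro Tools or any other commercial product works. It assumes equal-tempered tuning with A4 at 440 Hz and a steady tempo, and the function names are hypothetical.

```python
import math

A4_HZ = 440.0  # assumed tuning reference, not taken from the text


def snap_to_semitone(freq_hz):
    """Return the equal-tempered note frequency closest to freq_hz."""
    # Distance from A4 in semitones (fractional for an off-center pitch).
    semitones = 12.0 * math.log2(freq_hz / A4_HZ)
    # Rounding discards the slide on or off the note.
    return A4_HZ * 2.0 ** (round(semitones) / 12.0)


def cents_off(freq_hz):
    """How far freq_hz sits from its nearest note, in hundredths of a semitone."""
    semitones = 12.0 * math.log2(freq_hz / A4_HZ)
    return 100.0 * (semitones - round(semitones))


def snap_to_beat(onset_s, tempo_bpm):
    """Move an onset time (seconds) onto the nearest beat of a steady tempo."""
    beat_s = 60.0 / tempo_bpm
    return round(onset_s / beat_s) * beat_s
```

  A voice arriving slightly sharp at 445 Hz would be pulled back to exactly 440 Hz, and a syllable landing 30 milliseconds after a beat would be moved onto it. The small offsets that this rounding throws away are the “pitch-finding” that Warsh describes.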

  Swift provides a particularly salient example, perhaps, because her natural tendency toward a certain wobbliness of pitch is what gave her voice its charm and individuality in her early incarnation as an ingénue C&W singer. Her 2009 rendition on David Letterman’s show of the superbly written “Fearless” (about a young woman plunging into her first romance) benefited from her obvious lack of vocal virtuosity, the slides on and off pitch, the inexpertly timed inhalations. Now that she has mutated into an arena-filling singer of machinelike, propulsive dance pop, Swift’s voice is today invariably autotuned and otherwise augmented, both in studio and live, to fit the robotically precise music. What is gained in sonic precision, immediacy, and commercial popularity is lost in the delicate emotionalism of her actual, appealingly imperfect, voice.

  The Seashore and Metfessel study found that one property in particular epitomizes the artful “imprecision” that is the basis for all emotional expression in the voice: vibrato, a “flutter in pitch,” a wavering in the fundamental frequency of the phonating vocal cords of a semitone interval (the difference between a white key on the piano and the black key adjacent), at a rate of five to seven oscillations per second, with a simultaneous alternation in volume between loud and soft.5 That description sounds forbiddingly technical, but you actually know vibrato very well—you hear it every time you detect the quivering pulsation that singers like Ariana Grande or Beyoncé employ to give a plangent yearning to a long-drawn-out note, often at the end of a phrase. The brain converts the ambiguously pitched pair of notes into a single tone whose rapid movement between two frequencies mysteriously brings a lump to your throat or raises the hair on your arms.6
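
  The figures in that description can be turned into a small numerical sketch. The Python below is an illustration only, not Seashore and Metfessel’s method. It assumes a sine-shaped waver, treats the semitone as the total width of the swing centered on the held note, and uses 440 Hz, six oscillations per second, and a 30 percent loudness swing as arbitrary example values. The function names are hypothetical.

```python
import math


def vibrato_pitch_hz(t, note_hz=440.0, rate_hz=6.0, width_semitones=1.0):
    """Instantaneous pitch of a held note with vibrato, t seconds in."""
    # The pitch swings half the width above and half below the held note.
    offset = (width_semitones / 2.0) * math.sin(2.0 * math.pi * rate_hz * t)
    # Each semitone multiplies frequency by the twelfth root of two.
    return note_hz * 2.0 ** (offset / 12.0)


def vibrato_loudness(t, rate_hz=6.0, depth=0.3):
    """Relative loudness, rising and falling in step with the pitch."""
    return 1.0 + depth * math.sin(2.0 * math.pi * rate_hz * t)
```

  Under these assumptions the pitch completes six full swings in each second, and the ratio between its highest and lowest points is the twelfth root of two, which is one semitone.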

  More than eighty years after Seashore and Metfessel’s pioneering research, scientists are still debating the physiological origins, and psychological effects, of the vibrato. They have confirmed the authors’ assertion that it occurs in all singing styles; you hear it in folk and pop, country and opera, hip-hop and death metal.7 But it is used not only in singing: every time you see and hear a classical violinist wiggle her finger on the fingerboard while bowing a note, or see a grimacing rock star perform the same finger-shaking gesture high up on the neck of his guitar to coax a few extra quivers of emotion from a solo, you are witnessing the vibrato. Trumpeters, clarinetists, trombonists, flautists use their lips in combination with the instrument’s valves to create an identical fluttering of pitch and volume between two semitones—and at precisely the rate that Seashore and Metfessel first identified in the singing of African American folk singers of the 1920s: that is, between five and seven oscillations per second.

  Seashore, in a later book exclusively about the vibrato,8 called it “the most important of all the musical ornaments,” and posited that its universality in music reflects its origins in the original musical instrument: the voice, where vibrato is a natural reflex of the vocal muscles when holding a sustained note. Great singers modulate its speed and duration through controlled movements of the laryngeal cartilages (shifting them back and forth like the guitarist’s wiggling finger); rapid changes in respiration (pulsing movements of the diaphragm to vary the air pressure from the lungs and thus stress the periodic changes in volume from loud to soft that accompany the pitch changes); along with coordinated movements of the tongue, velum, and jaw that further shape the vibrato into its characteristic two-tone ululation. However, if the oscillation is too fast—that is, if it exceeds the seven oscillations per second that Seashore identified as the upper limit for an expressive vibrato—it creates an unpleasant effect that singers call “bleat”; too slow and you get the dreaded “wobble.” Overextending the vibrato on a note, or using it on every note—troweling on an impasto of affected or mannered emotion—creates schmaltz. Some strict music aficionados insist that when the vibrato is executed properly a listener doesn’t even consciously hear it, she merely responds with a powerful upwelling of emotion.9 Why remains mysterious. Seashore thought the answer lay in the vibrato’s evolutionary history. As he was the first to point out, all birds and nonhuman mammals produce the vibrato (“it is present… in the singing of the canary,” he wrote, “in the bark of the dog”), and at the same rate as in humans, five to seven oscillations per second, in a semitone interval. It is likewise heard in our human laughter and sobbing.

  No voice scientist, to my knowledge, has offered an explanation for the adaptive origins of this “phenomenon of sonance,” as Seashore calls the vibrato.10 I suspect the answer lies in Eugene Morton’s observations (derived from Darwin’s “principle of antithesis”) about how all animals blend, at moments of indecision, the low pitch of aggression with the high pitch of submission. For the vibrato is, after all, a series of rapid up-and-down frequency changes across the held note. Seashore observed that vibrato does not evoke particular emotions—“we cannot distinguish feelings of love from hatred, attraction from repulsion, excitement from tranquility, by the vibrato,” he wrote.11 But perhaps this very ambiguity, like the “indecision” Morton detected in the animal voices he studied, is the basis of the “emotional instability” that Seashore said is the hallmark of the vibrato’s universal, goose-bump-inducing power.

 
