Another benefit of phylogenetic analysis is that we can track transmission in the final stages of an outbreak. In March 2016, a new cluster of Ebola cases appeared in Guinea, three months after WHO had declared the West Africa epidemic over. Perhaps the virus had been spreading undetected in humans all along? When epidemiologist Boubacar Diallo and his collaborators sequenced viruses from the new cluster of cases, they hit upon an alternative explanation. The new viruses were closely related to an Ebola virus found in the semen of a local man who’d recovered from the disease back in 2014. The virus had persisted in his body for almost a year-and-a-half, before spreading to a sexual partner and sparking a new outbreak.[9]
Sequence data is becoming an important part of outbreak analysis, but the idea of evolving viruses can sometimes lead to alarmist coverage. During the Ebola and Zika epidemics, several media reports played up the fact that the viruses were evolving.[10] But this isn’t necessarily as bad as it sounds: all viruses evolve, in the sense that their genetic sequence gradually changes over time. Occasionally this evolution will lead to a difference we care about – like the flu virus changing its appearance – but often it will just happen in the background without having a noticeable effect on an outbreak.
The rate of evolution can affect our ability to analyse outbreaks, though. Phylogenetic analysis is more effective when looking at pathogens that evolve fairly quickly, like HIV and flu. This is because the genetic sequence will change as pathogens spread from one person to another, making it possible to estimate the likely path of infection. In contrast, viruses like measles evolve slowly, which means there won’t be much variation from one person to another.[11] As a result, working out how the cases are related is a bit like trying to piece together a human family tree in a country where everyone has the same surname.
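To make the surname problem concrete, here is a minimal sketch in Python. The short sequences and patient labels are invented for illustration, and counting mismatched letters (the Hamming distance) is a deliberately crude stand-in for real phylogenetic inference, which models mutation over time rather than just tallying differences.

```python
def hamming(a, b):
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_sources(case, candidates):
    """Return every candidate whose sequence is nearest to the new case."""
    distances = {name: hamming(case, seq) for name, seq in candidates.items()}
    best = min(distances.values())
    return sorted(name for name, d in distances.items() if d == best)


# Invented toy data: a fast-evolving pathogen picks up changes between patients...
fast_patients = {"A": "ACGTACGTAC", "B": "ACGTACGTAT", "C": "ACGAACGTAC"}
fast_case = "ACGAACGTAG"

# ...whereas a slow-evolving one looks identical in everybody.
slow_patients = {"A": "ACGTACGTAC", "B": "ACGTACGTAC", "C": "ACGTACGTAC"}
slow_case = "ACGTACGTAC"

print(closest_sources(fast_case, fast_patients))  # one clear candidate
print(closest_sources(slow_case, slow_patients))  # everyone ties
```

With the fast-evolving sequences, one patient stands out as the nearest relative of the new case. With the slow-evolving ones, every patient is an equally good match, so the genetic data alone cannot separate them.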
As well as biological limitations to phylogenetic methods, there are also practical ones. In the early stages of the West Africa Ebola epidemic, Pardis Sabeti, a geneticist with the Broad Institute in Boston, analysed sequence data from ninety-nine viruses from Sierra Leone. Phylogenetic trees showed that the infection had spread from Guinea to Sierra Leone in May 2014, possibly after a funeral. Given the seriousness of the outbreak, Sabeti and her colleagues quickly added the new genetic sequences to a public database. This initial burst of research was then followed by a period of relative silence. Although several other teams had been collecting virus samples, nobody else released any new genetic sequences between 2 August and 9 November 2014. During this same period, there were over 10,000 Ebola cases reported in West Africa, with the epidemic reaching its peak in October.[12]
There are a couple of possible reasons for the delay in releasing sequences. The cynical explanation is that new data are valuable academic currency. Research papers using genetic sequences to study outbreaks are likely to get published in coveted scientific journals, which creates an incentive for researchers to sit on potentially important data. However, based on my interactions with researchers during this period, I’d like to think it was mostly a matter of obliviousness rather than malicious intentions. Scientific culture just wasn’t adapted for outbreak timelines. Researchers are used to developing protocols, performing thorough analysis, writing up their methods, and submitting the results to be peer-reviewed by fellow scientists. This process can take months – if not years – and has historically slowed the release of new data.
Such delays are a problem across science and medicine. When Jeremy Farrar took over as director of the Wellcome Trust in March 2014, he told The Guardian that clinical research often took too long, something that became apparent in the following months as the Ebola outbreak grew. ‘The systems we have got in place are not fit for purpose when the situation is moving quickly,’ Farrar said. ‘We have nothing that enables us to respond in real time.’[13]
This culture is gradually changing. In mid-2018, what would become another major Ebola outbreak began in the Democratic Republic of the Congo. This time, researchers were quick to release new sequence data. Teams also launched a clinical trial of four experimental treatments. By August 2019, they’d shown that a prompt infusion of anti-Ebola antibodies could increase someone’s chances of survival to over 90 per cent, up from a historical average of around 30 per cent. Meanwhile, outbreak scientists are increasingly posting draft papers on websites like bioRxiv and medRxiv, which aim to make new research accessible before it undergoes peer-review.[14]
During her time working in Sierra Leone, Sabeti discovered that the word for Kenema, the city where they were based, meant ‘clear like a river, translucent and open to the public gaze’.[15] This openness was reflected in her team’s work, with those ninety-nine sequences shared early in the outbreak. The attitude has also taken hold among the wider community of outbreak researchers. One of the best examples is the Nextstrain project, pioneered by computational biologists Trevor Bedford and Richard Neher. This online platform automatically collates genetic sequences to show how different viruses are related and where they might have come from. Although Bedford and Neher initially focused on flu, the platform now tracks everything from Zika to tuberculosis.[16] Nextstrain has proved to be a powerful idea, not just because it brings together and visualises all the available sequences, but because it’s separate from the slow and competitive process of publishing scientific papers.
As it becomes easier to sequence pathogens, phylogenetic methods will continue to improve our understanding of disease outbreaks. They will help us discover when infections first sparked, how outbreaks grew, and what parts of a transmission process we might have missed. The methods also illustrate a wider trend in outbreak analysis: the ability to combine new data sources to get at information that has traditionally been hard to come by. With phylogenetics, we can uncover the spread of outbreaks by linking patient information with the genetic data of the viruses that infected them. These kinds of ‘data linkage’ approaches are becoming a powerful way of understanding how things mutate and spread in a population. But they aren’t always being used in the ways we might expect.
Goldilocks was a dishonest, foul-mouthed old woman who burgled a trio of well-meaning bears. At least, she was when poet Robert Southey first published the story in 1837. After swearing her way through three bowls of porridge and breaking a chair, the woman heard the bears come home and made her escape through a window. Southey didn’t give her a name or golden hair; those details would come decades later, as the villainous woman evolved into a troublesome child and finally the Goldilocks most of us know today.[17]
The tale of the bears has been around for a long time. A few years before Southey published his story, a woman named Eleanor Mure had written a homemade book for her nephew. This time the bears caught the old woman at the end of the story. Angry at the damage, the bears set her on fire, tried to drown her, and then impaled her on the steeple of St Paul’s Cathedral. In an earlier folk story, three bears saw off a mischievous fox.
According to Jamie Tehrani, an anthropologist at Durham University, we can think of culture as information that mutates as it gets transmitted from person-to-person and generation-to-generation. If we want to understand the spread and evolution of culture, folk stories are therefore useful because they are the product of their society. ‘By definition, folktales don’t have a single authoritative version,’ said Tehrani. ‘They are stories that belong to everybody in the community. They have this organic quality.’[18]
Tehrani’s work on folktales started with ‘Little Red Riding Hood’. If you live in Western Europe, you’re probably familiar with the tale as told by the Brothers Grimm in the nineteenth century: a girl visits her grandmother’s house, only to be met by a wolf in disguise. However, this isn’t the only version of the story. There are several other folktales out there that bear similarities to ‘Little Red Riding Hood’. In Eastern Europe and the Middle East, people tell the story of ‘The Wolf and the Kids’: a disguised wolf tricks a group of baby goats into letting him into their house. In East Asia, there is the tale of ‘The Tiger Grandmother’, in which a group of children encounter a tiger that pretends to be their elderly relative.
The tale has spread across the world, but it’s difficult to tell in which direction. A common theory among historians is that the East Asian version was the original, with the European and Middle Eastern stories coming later. But did ‘Little Red Riding Hood’ and ‘The Wolf and the Kids’ really evolve from ‘The Tiger Grandmother’? Folktales have historically been spoken rather than written down, which means historical records are shallow and patchy. It’s often not clear exactly when and where a particular story originated.
This is where phylogenetic approaches can come in useful. To investigate the evolution of ‘Little Red Riding Hood’ and its variants, Tehrani gathered together almost sixty different versions of the story, spanning multiple continents. In place of a genetic sequence, he summarised each story based on a set of seventy-two plot features, such as the type of lead character, the trick used to deceive them, and how the story ended. He then estimated how these features evolved, resulting in a phylogenetic tree that mapped the relationship between the stories.[19] His analysis would produce an unexpected conclusion: based on the phylogenetic tree, it seemed that ‘The Wolf and the Kids’ and ‘Little Red Riding Hood’ had come first. Contrary to common belief, ‘The Tiger Grandmother’ was apparently a blend of existing tales, rather than being the original version from which others evolved.
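The idea of swapping a genetic sequence for a list of plot features can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the four 'variants' and their eight yes/no features are invented rather than taken from the study, and the simple average-linkage grouping below is a stand-in, not the method the study used.

```python
from itertools import combinations

# Invented codings: 1 means a plot feature is present, 0 means it is absent.
STORIES = {
    "variant A": (1, 0, 1, 0, 1, 0, 0, 1),
    "variant B": (1, 0, 1, 0, 1, 0, 1, 1),
    "variant C": (0, 1, 0, 1, 1, 1, 0, 0),
    "variant D": (0, 1, 0, 1, 0, 1, 0, 0),
}


def distance(a, b):
    """Fraction of plot features on which two stories disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_distance(stories, members_1, members_2):
    """Mean distance between every story in one group and every story in another."""
    pairs = [(m, n) for m in members_1 for n in members_2]
    return sum(distance(stories[m], stories[n]) for m, n in pairs) / len(pairs)


def build_tree(stories):
    """Repeatedly merge the two most similar groups into a nested tuple."""
    # Each cluster is (subtree, names of the stories inside it).
    clusters = [(name, [name]) for name in sorted(stories)]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(
                stories, clusters[ij[0]][1], clusters[ij[1]][1]
            ),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


print(build_tree(STORIES))
```

The nested tuple that comes out is a crude tree: stories sharing most of their features are paired first, and those pairs are then joined further up. A grouping like this shows only which stories resemble each other. Working out which came first requires a model of how features change over time.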
Evolutionary thinking has a long history in the study of language and culture. Decades before Darwin drew his tree of life, linguist William Jones had been interested in how languages emerge, a field known as ‘philology’. In 1786, Jones noted the similarities between Greek, Sanskrit, and Latin: ‘no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists.’[20] In evolutionary terms, he was suggesting that these languages had evolved from a single common ancestor. Jones’s ideas would later influence many other scholars, including the Brothers Grimm, who were keen linguists. As well as collecting together different variants of folktales, they tried to study how the use of language had changed over time.[21]
Modern phylogenetic methods make it possible to analyse the evolution of such stories in much more detail. After studying ‘Little Red Riding Hood’, Jamie Tehrani worked with Sara Graça da Silva at the University of Lisbon to examine a much wider range of stories, tracing the evolution of 275 folktales in total. The pair found that some tales have a long history; stories such as ‘Rumplestiltskin’ and ‘Beauty and the Beast’ may have originally emerged over 4,000 years ago. This would mean they are as old as the Indo-European languages through which they spread. Although many folktales eventually travelled widely, da Silva and Tehrani also found traces of local rivalry in storytelling. ‘Spatial proximity appears to have had a negative effect on the tales’ distributions,’ they noted, ‘suggesting that societies were more likely to reject than adopt these stories from their neighbours.’[22]
Folktales are often tied to a country’s identity, even if their origins are not. When the Brothers Grimm compiled their collection of traditional ‘German’ stories, they noticed that there were similarities with tales in many other cultures, from Indian to Arabic. Phylogenetic analysis confirms just how much story borrowing there has been. ‘There’s not a great deal that’s special about any one country’s oral tradition,’ Tehrani said. ‘In fact, they’re highly globalised.’
Why did humans start telling stories in the first place? One explanation is that tales help us preserve useful information. There’s evidence that storytelling is a highly valued skill in hunter-gatherer societies, leading to suggestions that stories took hold in the early stages of human history because good storytellers were more desirable as mates.[23] There are two competing theories about what sort of story-based information we have evolved to value. Some researchers suggest that stories relating to survival are most important: deep down, we want information about where food and dangers are. This would explain why tales that evoke reactions like disgust are memorable; we don’t want to poison ourselves. Others have argued that because social interactions dominate human life, socially relevant information is most useful. This would imply that we preferentially remember details about relationships and actions that break social norms.[24]
To test these two theories, Tehrani and his colleagues once ran an experiment looking at the spread of urban legends. Their study mimicked the children’s game of ‘broken telephone’: tales were passed from one person to another, then to another, with the final version showing how much was remembered. They found that stories containing survival or social information were more memorable than neutral stories, with the social stories outperforming the survival ones.
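One rough way to picture a transmission chain like this is to assume that each detail of a story survives each retelling with a fixed probability. That assumption, and the recall probabilities below, are invented for illustration; they are not estimates from the experiment.

```python
def expected_retained(p_recall, retellings):
    """Expected fraction of details left after a chain of retellings,
    assuming each detail survives each step independently with p_recall."""
    return p_recall ** retellings


# Hypothetical per-retelling recall probabilities (not measured values).
recall = {"social": 0.9, "survival": 0.8, "neutral": 0.7}

for kind, p in recall.items():
    print(kind, round(expected_retained(p, retellings=3), 3))
```

Under this assumption, small differences in how memorable a story is at each step compound along the chain, so the gap between story types widens with every retelling.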
Other factors can also boost the success of stories. Earlier broken telephone experiments found that tales tend to become shorter and simpler as they spread: people remember the gist but forget the details. Surprises can help a tale as well. There’s evidence that tales are catchier if they include counter-intuitive ideas. However, there is a balance to be struck. Stories need some surprising features, but not too many. Successful folk tales generally have a lot of familiar elements, combined with a couple of absurd twists. Take Goldilocks, the story of a girl who explores the family home of a mother, father, and baby. The twist, of course, being that it’s a family of bears. This narrative trick also explains the attraction of conspiracy theories, which take real-life events and add an unexpected slant.[25]
Then there’s the structure of a story. Goldilocks’ popularity might not be down to her, but rather the three bears. They turn the story into a sequence of memorable triplets: the bowls of porridge are too hot, too cold, just right; the beds are too soft, too hard, just right. This rhetorical trick is known as the ‘rule of three’ and crops up regularly in politics, from the speeches of Abraham Lincoln to Barack Obama.[26] Why are lists of three so powerful? It might have something to do with the mathematical importance of triplets: in general, we need at least three items in a sequence to establish (or break) a pattern.[27]
Patterns can also help with the spread of individual words. As language evolves, new words often have to compete to displace already popular ones. In such situations, we might expect people to prefer words that follow consistent rules. For example, past tense verbs often end in ‘…ed’, so it makes sense that the historical word ‘smelt’ has made way for ‘smelled’, while ‘wove’ is gradually becoming ‘weaved’.[28]
Yet some words have evolved in the other direction. In the 1830s, people would have ‘lighted’ a candle; nowadays we’d talk of having lit one. Why did these irregular words outcompete popular ones? A group of biologists and linguists at the University of Pennsylvania reckon that rhyming might have had something to do with it. They noticed that in the mid-twentieth century, Americans started saying ‘dove’ instead of ‘dived’ as the past tense of ‘to dive’. Around the same time, newly popular cars were causing people to adopt words like ‘drive’ and ‘drove’. Similarly, people started using ‘lit’ and ‘quit’ instead of ‘lighted’ and ‘quitted’ during the period that ‘split’ became a popular way of saying you were going to leave.
There are two main ways that new words and stories can spread through a population. Either they pass down from generation-to-generation, perhaps picking up some variations along the way; this is known as ‘vertical transmission’. Alternatively, tales may blend across communities in the same generation, in a process of ‘horizontal transmission’. Da Silva and Tehrani have found that both types of transmission have influenced the spread of folktales, but for the majority of stories, the vertical route was more important. In other areas of life though, horizontal transmission can dominate. Creators of computer programs often reuse existing lines of code, perhaps because there’s a useful feature they need to include, or because they want to save time. In evolutionary terms, this means that computer code can ‘time travel’, with bits of old programs or languages suddenly popping up in new ones.[29]
If sections of stories or computer code mix together within a single generation, it becomes difficult to draw a neat evolutionary tree. If a parent tells their child a traditional family story, then the child incorporates parts of their friends’ family stories, the new tale essentially fuses all these different branches of stories together. The same problem is well known to biologists. Take the 2009 ‘swine flu’ pandemic. The outbreak started when genes from four viruses – a bird flu virus, a human flu virus and two different swine flu strains – jumbled together inside an infected pig in Mexico, creating a new hybrid virus that then spread among humans.[30] One gene was closely related to other human flu viruses; another was similar to circulating bird flu strains; others were like swine viruses. And yet, taken as a whole, this new flu virus wasn’t really like anything else. Changes like these show the limitations of a simple tree metaphor. Although Darwin’s tree of life captures many features of evolution, the reality – with genes potentially passing within as well as between generations – is more like a bizarre, unkempt hedge.[31]
The processes of horizontal and vertical transmission can make a big difference to how traits spread through a population. In the waters of Shark Bay, just off the coast of Western Australia, a handful of bottlenose dolphins have started using tools to forage for food. Marine biologists first noticed the behaviour in 1984; dolphins were breaking off bits of marine sponge and wearing them as a protective mask while they rummaged for fish in the seabed. But not all dolphins in Shark Bay would go on to use ‘sponging’. Only around one in ten have picked up the technique.[32] Why hasn’t the behaviour spread further? Twenty years after biologists first observed sponging, a group of researchers used genetic data to show that the tactic was almost entirely the result of vertical transmission. Dolphins are famously social, but it seems that after one initial dolphin came up with the innovation, it only spread through their family line. Individuals who weren’t related to them kept on foraging sponge-free. In effect, this family of dolphins had created their own unique tradition.