An Appetite for Wonder
(24th Nov 1985) The Chaplain bets Dr Ridley a bottle of claret that Dr Bennett will be wearing a clerical collar at dinner on the occasion of the visit of the Bishop of London. (Chaplain won.)
(4th August 1993) Mr Dawkins bets Mr Raine £1 that Bertrand Russell married Lady Ottoline Morrell. Adjudicator Mlle Bruneau. (Dawkins lost and paid, 20 years late.)
Bets like the last one can’t happen any more because it is so trivially easy for everyone to check such factual questions on their smartphones without rising from their Senior Common Room armchairs. Even then, it was scarcely necessary to appoint an adjudicator for a purely factual matter.
Back to 1970, when I was twenty-nine and newly returned to Oxford. The singing Elliott had gone the way of all silicon, but Moore’s Law and the research grant that had lured me back to Oxford the previous year made it possible for me to have my ‘own’ computer, a PDP-8, which exceeded the Elliott in every respect except physical size and price. Also in accordance with Moore’s Law (which was already going strong in those days), it was functionally much smaller yet physically larger than a modern laptop, and ludicrously it had a log book in which you were supposed to record every time you switched it on (of course I didn’t). It was my pride and joy and a valued resource – together with me as sole programmer for everybody in 13 Bevington Road (which took its toll on my time). Now my addiction to computers could really take off, and I no longer had to indulge it nocturnally, as during my shameful affair with the Elliott 803.
Previously I had used only high-level compiler languages – human-friendly languages, which the computer translates into its own binary machine language. But now, in order to use the PDP-8 as a research tool, I had to master its 12-bit machine language, a task into which I threw myself with zest. My first machine-code project was the ‘Dawkins Organ’, a system for recording animal behaviour – equivalent to George Barlow’s ‘Data Acquisition’ apparatus but much, much cheaper. The idea was to make a keyboard which an observer could use in the field, pressing buttons to indicate actions by an animal. Key-presses would be recorded on a tape recorder, which would later automatically tell the computer exactly when each action by the animal occurred.
My keyboard literally was a makeshift electronic organ, with each key playing a different note (inaudible except to the tape recorder). This part would be easy to make. The box would contain a simple two-transistor oscillator, the pitch of whose note was tuned by a resistance. Each key on the organ would connect a different resistor and hence play a different note. The observer was to take the organ into the field and watch an animal’s behaviour like a work-study officer, pressing a specific key for each behaviour pattern. A tape recording of the sequence of notes would then constitute a timed record of the animal’s behaviour. Theoretically, a person with a good ear listening to the tape could detect which key had been pressed, but this wouldn’t be helpful. I needed to cast the computer in the role of person-with-good-ear. It could have been done electronically, with a series of tuned frequency-detectors, but that would have been an expensive hassle. Could the same feat – perfect pitch sensitivity in the computer – be achieved in software alone?
I was discussing the problem with my computer guru of the time, Roger Abbott, a clever engineer (and coincidentally organist) employed on the large research grant of Professor Pringle. Roger came up with an inspired suggestion. Every musical note has a characteristic wavelength which signifies its pitch. Computers are – and were, even in those days – so fast that the interval between wave crests within a musical note could be measured in hundreds of program cycles. Roger suggested that I should write a machine-code program to time the intervals between wave peaks: write, in other words, a little routine to act as a high-speed clock, counting how many jumping-back program loops it could walk through before being interrupted by the next wave crest (which, when averaged over lots of wave crests, tells it the pitch of the note). When a note ended (when more than a critical time elapsed since the last wave peak) the computer should make a note of the time and then wait for the next organ note. The computer’s clocking loop, in other words, would be used not only to recognize the pitch of a musical note but also, on a hugely longer timescale, to measure the passage of time between notes.
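The timing idea can be sketched in a few lines of modern Python. This is only an illustration under stated assumptions, not the original PDP-8 machine code: the waveform is assumed to be already digitized as a list of samples, one "tick" of the counting loop is assumed to correspond to one sample, and the names `crest_intervals` and `estimate_pitch` are invented here.

```python
import math


def crest_intervals(samples):
    """Return the gaps, in ticks, between successive wave crests.

    A crest is taken to be a positive sample that is higher than the
    sample before it and at least as high as the sample after it.
    """
    crests = [
        i
        for i in range(1, len(samples) - 1)
        if samples[i] > 0 and samples[i - 1] < samples[i] >= samples[i + 1]
    ]
    return [later - earlier for earlier, later in zip(crests, crests[1:])]


def estimate_pitch(samples, ticks_per_second):
    """Average the crest-to-crest gaps and convert them to a frequency in Hz."""
    gaps = crest_intervals(samples)
    if not gaps:
        return None  # fewer than two crests: no pitch can be estimated
    mean_gap = sum(gaps) / len(gaps)
    return ticks_per_second / mean_gap


# Hypothetical test tone: a 440 Hz sine wave, timed at 44,000 ticks per second.
rate = 44_000
tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(4_400)]
pitch = estimate_pitch(tone, rate)
```

Averaging over many crests, as the text describes, is what makes the estimate tolerant of small timing errors on any single crest.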
Having got this central routine working, the rest was just a matter of slogging through the writing and debugging of a user-friendly program. This took rather a long time, but it ended successfully. The Dawkins Organ was a viable product. The user of the organ began each session by playing a scale on the tape – all the notes on the organ in ascending order of pitch. The taped scale would then be used to ‘calibrate’ the software – ‘teach’ the computer the repertoire of notes it would be asked to recognize. After the calibration scale was ended (by hitting the first note for the second time), all further notes on the tape would designate behavioural events. This calibration system had the advantage that the organ did not have to be carefully tuned. Any set of notes that were sufficiently distinct from each other would do, because the computer quickly learned which notes to listen out for.
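The calibration step might be sketched as follows. This is a hypothetical reconstruction, not the published program: each note is assumed to have been reduced to a single measured period (in clock ticks), the function names and the 3 per cent tolerance are invented for the example, and a later note is simply matched to the nearest calibrated period.

```python
def calibrate(periods, tolerance=0.03):
    """Split a recording into calibration references and behavioural notes.

    Notes are collected as references until the first note is heard a
    second time (within the given fractional tolerance); everything after
    that repeat is returned as the behavioural record.
    """
    references = []
    for index, period in enumerate(periods):
        if references and abs(period - references[0]) <= tolerance * references[0]:
            return references, periods[index + 1:]
        references.append(period)
    raise ValueError("calibration scale never returned to its first note")


def classify(period, references):
    """Return the index of the calibrated key whose period is closest."""
    return min(range(len(references)), key=lambda k: abs(references[k] - period))


# Hypothetical tape: a four-key scale (ascending pitch means shrinking period),
# the first key again to end calibration, then three behavioural events.
tape = [100, 90, 80, 70, 101, 79, 91, 99]
references, events = calibrate(tape)
keys = [classify(period, references) for period in events]
```

Because every later note is compared only with the scale recorded on the same tape, the absolute tuning of the keys does not matter, which is the advantage the text points out.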
So, when the tape was brought home and played into the computer, the computer knew exactly what the animal had done, and when. The nucleus of the program was the timing loop, but it was embedded in a substantial quantity of code to punch out, on paper tape, the names of all the behaviour patterns and the exact times when they occurred.
I published a paper on the Dawkins Organ,53 and made the software available free of charge. Over the next few years Dawkins Organs were used by numerous members of the Oxford ABRG, and by some ethologists elsewhere in the world, for example in the University of British Columbia.
My addiction to machine code programming took me in a downward spiral. I even devised my own programming language, BEVPAL, with its own programming manual, a somewhat otiose exercise since the language was used by nobody except myself and, briefly, Mike Cullen. Douglas Adams amusingly satirized computer addiction of exactly the kind that hit me. The target of his satire was the programmer who had a particular problem X, which needed solving. He could have written a program in five minutes to solve X and then got on and used his solution. But instead of just doing that, he spent days and weeks writing a more general program that could be used by anybody at any time to solve all similar problems of the general class of X. The fascination lies in the generality and in the purveying of an aesthetically pleasing, user-friendly product for the benefit of a population of hypothetical and very probably non-existent users – not in actually finding the answer to the particular problem X. Another symptom of this kind of geekish addiction is that every time you solve a local problem and make the computer jump through yet another hoop, you want to rush out into the street and drag someone in to show them how elegant it is.
The productive camaraderie that a small building like 13 Bevington Road fosters came to an end around this time, and the animal behaviour group moved to the new zoology/psychology building, the huge, battleship-like horror on South Parks Road, then informally known as HMS Pringle after the ambitious Linacre Professor who persuaded the university authorities to build it – having failed to cajole them into building a pencil-thin skyscraper that would have disastrously overtopped Matthew Arnold’s dreaming spires. I have mixed feelings about my part in later getting HMS Pringle officially named the Tinbergen Building, for it is widely deplored as the ugliest building in Oxford. It won an architectural award from the Concrete Society – enough said.
Around this time I published a short paper in Nature.54 Every day hundreds of thousands of our brain cells die, and this was upsetting to me even at the age of twenty-nine. My Darwin-obsessed brain sought comfort in the idea that if the cell deaths were non-random, such apparently wholesale slaughter might be constructive, not purely destructive:
A sculptor changes a homogeneous lump of rock into a complex statue by subtraction, not addition, of material. An electronic data processing machine is most likely to be made by connecting components up in complex ways, and then enriching the connexions to make it even more complex. On the other hand, it could be constructed by starting with extremely rich, even random interconnexions, and then carving out a more meaningful organization by selectively cutting wires.
. . .
The theory proposed here may seem fanciful at first. Further reflexion shows, however, that its lack of verisimilitude is mainly a consequence of the highly improbable postulate on which it rests; namely, that brain cells are decreasing in numbers at a prodigious rate daily. Because this postulate, however far-fetched, is an established fact, the present theory is not suggesting anything very implausible in addition; rather the reverse, as it makes the process seem less wasteful. All that is at issue is whether neurones die at random, or selectively in such a way as to store information.
A curious little one-off, this paper is perhaps mildly interesting as an early example of the kind of theory that later became fashionable under the name – coined a year later, so I obviously didn’t use it – ‘apoptosis’.
Marian soon got her doctorate, and we began to collaborate on research projects growing out of the many discussions – mutual tutorials – from our Berkeley days. We planned a study that would exemplify, and clarify, one of the fundamental concepts of the ethological school of animal behaviour studies, the Fixed Action Pattern.
Lorenz and Tinbergen and their school thought that much of animal behaviour consisted of a sequence of little clockwork routines – Fixed Action Patterns (FAP). Each FAP was thought to be like a piece of anatomy, just as much a part of the animal’s bodily equipment as, say, the collar bone or the left kidney. The difference is that collar bones and kidneys are made of solid material, whereas the FAP has a time dimension: you can’t pick it up and put it in a drawer, you have to watch it play out in time. A familiar example of an FAP would be the pushing movement that a dog makes with its snout when burying a bone. These movements are identically replayed even when the bone is on a carpet and there is no soil in which to bury it. The dog really does look like a (charming) clockwork toy, although the exact direction of the movement is influenced by the position of the bone.
Every animal has a repertoire of FAPs, like one of those dolls that you wind up by pulling a string and which then utters a saying randomly plucked from a finite repertoire. Once initiated, whichever saying is chosen goes through to completion. The doll doesn’t switch messages halfway through. The decision of which of a dozen sayings to produce is unpredictable but, once taken, the consequences of the decision are followed through predictably as clockwork. That was the FAP doctrine in which Marian and I, as Tinbergenian ethologists, had been brought up; but was it a true reflection of reality? This was the question we wanted to answer – or, to be precise, the question we sought to re-express in terms that might make it answerable.
In theory, one could write down the continuous stream of animal behaviour as a sequence of muscular contractions. But if the FAP theory was right, the predictability of behaviour would render it a laborious waste of effort to write down every muscular contraction, even were it possible to do so. Instead, all we should need to do is write down the FAPs, and the sequence of FAPs would – on an extreme interpretation – be a complete description of that particular animal’s behaviour.
But this would only work if FAPs really were equivalent to organs or bones – if it were true, in other words, that each pattern occurs as a whole, not breaking off part-way through or mixing with another pattern. Marian and I wanted to find a way to assess the extent to which this proposition was true. Both our doctoral theses had been concerned – in our two ways – with decision-making, and it was natural for us to translate the FAP problem into the language of decisions. In that language, the animal takes a decision to initiate an FAP; but, once initiated, the FAP goes on to its conclusion, with no further decisions until the end. At that point the animal’s behaviour stream would enter a period of uncertainty, pending the next decision to initiate (and complete) an FAP.
We chose to study drinking in chicks as our example, and we hoped it would be representative.55 Drinking in birds (other than pigeons and doves, which suck) is an elegant glissando of a movement, and it certainly gives a subjective impression of being initiated by a discrete decision, after which it always goes through to completion. But could we back up our subjective impression with hard data?
We filmed a side view of our chicks drinking, and then analysed the behaviour frame by frame to see if we could measure its ‘decision structure’. We measured the position of the bird’s head in successive frames of film, then fed the coordinates into the computer. The idea was to measure the predictability of the next frame, knowing the position of the head in previous frames.
The diagram is a graph of eye height against time, for three drinks by the same chick, lined up (zero on the time axis) on the moment when the bill hit the water. You get a sense that from that moment on, indeed from just before it, the behaviour is stereotyped and predictable, but the early part of the downstroke is more variable and subject to decisions: decisions to pause and even (as we showed separately) to abort the drink.
But how should we measure predictability? The graph below shows one way. It represents a single drink in the same way as before. But each point on the graph of eye position has arrows attached to it. The length of the arrow signifies, for each frame of film, the likelihood (as totted up over all the drinks by all the chicks) that the eye height in the next frame will be lower, higher, or the same.
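The tallying behind such arrows can be illustrated with a short sketch. The data and names here are invented for the example, and the assumption that every drink is already aligned frame by frame is a simplification; it is not the original analysis program.

```python
from collections import Counter


def direction(current_height, next_height):
    """Label the change in eye height between two successive frames."""
    if next_height > current_height:
        return "higher"
    if next_height < current_height:
        return "lower"
    return "same"


def direction_probabilities(drinks):
    """For each frame, estimate the chance the next frame is lower, same or higher.

    `drinks` is a list of eye-height sequences, one per drink, assumed to be
    aligned so that index 0 is the same moment in every drink.
    """
    frames = min(len(drink) for drink in drinks)
    table = []
    for f in range(frames - 1):
        counts = Counter(direction(drink[f], drink[f + 1]) for drink in drinks)
        total = sum(counts.values())
        table.append({label: counts[label] / total
                      for label in ("lower", "same", "higher")})
    return table


# Four made-up drinks, eye height in arbitrary units, one value per film frame.
drinks = [[5, 4, 3, 3], [5, 5, 3, 4], [5, 4, 4, 4], [5, 6, 3, 4]]
probabilities = direction_probabilities(drinks)
```

Each dictionary in `probabilities` corresponds to the three arrows attached to one point on the graph: the longer the arrow, the larger the tallied proportion.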
You can see that during the upstroke, when the bird is allowing the water to trickle down its throat, there is a high probability that the upstroke will continue its graceful curve in the upward direction. A decision to perform an FAP is being carried out, with no further decisions during its course. But during the downstroke there is more unpredictability. For each frame of the downstroke, the height of the eye in the next frame is undecided between lower or the same, and there is even some likelihood that it might be higher – that is, that the drink will be aborted.
Could we use these arrows to compute an index of uncertainty or ‘decisioniness’? The index we chose was based on information theory, devised in the 1940s by the inventive American engineer Claude Shannon. The information content of a message can be informally defined as its ‘surprise value’. Surprise value is a convenient opposite of predictability. Classic examples are ‘It is raining in England today’ (low information content because no surprise) versus ‘It is raining in the Sahara Desert’ (high information content because surprising). For reasons of mathematical convenience, Shannon computed his index of information content in bits (short for ‘binary digits’), by summing up the logarithm (base 2) of the prior probabilities that were open to doubt before the message was received. The information content of a penny toss is one bit, because the prior uncertainty is heads or tails – two equiprobable alternatives. The information content of a playing card’s suit is two bits (there are four equiprobable alternatives and the base two logarithm of four is two, corresponding to the minimum number of yes/no questions you’d need to ask in order to establish the suit). Most real examples are not so simple, and the possible outcomes are usually not equiprobable, but the principle is the same and a version of the same mathematical formula conveniently does the trick. It was this mathematical convenience that led us to use the Shannon Information Index as our measure of predictability or uncertainty.
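The penny and playing-card figures can be checked with the standard Shannon formula, H = −Σ p log2 p, summed over the possible outcomes. The sketch below is a generic illustration; the function name and the third, unequal example are invented here and are not taken from the chick data.

```python
import math


def information_bits(probabilities):
    """Shannon information, in bits, of a set of outcome probabilities.

    Outcomes with zero probability contribute nothing and are skipped,
    since log2(0) is undefined.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


coin = information_bits([0.5, 0.5])                # two equiprobable outcomes
suit = information_bits([0.25, 0.25, 0.25, 0.25])  # four equiprobable suits
skewed = information_bits([0.9, 0.05, 0.05])       # one outcome strongly favoured
```

The coin comes out at 1 bit and the suit at 2 bits, as in the text, while the skewed case gives only about 0.57 bits: when one outcome is very likely, the result carries little surprise.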
Once again we have a graph of eye height against time during a drink. The thin lines represent times of low predictability, or high probability of a decision intervening to change the future. The thick black lines represent times of high predictability (information content less than an arbitrary threshold of 0.4 bits), during which a decision is being carried out and no new decision is expected. The upstroke is predictable once it starts, but the downstroke is not. The pause between drinks is predictable for the rather boring reason that the pause is most likely to continue into the next frame – it’s hard to predict when the next drink will start.
As always, keep in mind that the particular behaviour, here drinking, is not of interest in itself. Drinking in chicks was a stand-in for behaviour generally, just as pecking was in my doctoral research. We were interested in the very idea of a decision and – in the case of drinking – whether we could identify moments of decision. We were trying to explore a way to demonstrate the very existence of a Fixed Action Pattern, rather than simply taking it for granted as ethologists were wont to do.
We adopted a different approach in our next project on decision-making, which was a study of self-grooming in flies. Ethologists often ask whether, if you know what an animal is doing now, you can predict what it will do next. Marian and I wanted to know whether, sometimes, you can predict what it will do in the more distant future better than you can predict what it will do in the more immediate future. This might be true, for instance, if behaviour is organized like human language. There are times when the beginning of a sentence predicts how the sentence will end better than it predicts the middle – which might contain any number of embedded adjectival or relative clauses, for instance. ‘The girl hit the ball’ is a sentence whose beginning demands something like the ending, whether or not there are embedded adjectives or adverbs or clauses in the middle: ‘The girl with red hair, who lives next door, vigorously hit the ball.’
We didn’t find evidence of language-like grammatical structure in the grooming behaviour of flies (although see below). But we did find an interesting zigzag pattern in the way predictability decays over time: in other words, the immediate future may be less predictable than the (slightly) more distant future. I’ll outline our research briefly here and not in detail, because it’s a bit complicated.
Flies are not normally seen as beautiful, but the way they wash their faces and their feet is rather dear. Look next time a fly lands on you: you’ll very probably see the behaviour. It may rub its front feet together, or wipe its great big eyes with them. It may rub the middle foot on one side against the hind foot on the same side, or clean its abdomen or wings with its hind feet. Somewhere inside that tiny head, decisions are spontaneously being generated, and a fair number of those decisions concern which bit of the body to clean next. The appeal of self-grooming behaviour for us was that the fly’s choice of behaviour was unlikely to be externally stimulated. We presumed that external stimulation amounted to an ever-present need to keep clean – ever-present in the sense that, though important, it was unlikely to determine exactly when a particular grooming action would be chosen. Dirty wings would impair flight. Dirt would impair the highly sensitive tasting organs in the feet, which flies use to decide whether or not to stick out the tongue and eat. So cleaning is important. But presumably the decision about which bit to clean is not determined by the sudden arrival of a new piece of dirt. Rather, we suspected that these rapid, moment-to-moment decisions were internally generated by unseen fluctuations deep inside the nervous system.