far beneath the cheering crowds, where planners have taken
advantage of the stadium’s deep foundations to create a stable
environment for the university’s world-leading mirror-making
laboratory. This is the domain of Roger Angel, a now ageing hip-
pie who combines the sharpest of scientific insight with a crafts-
man’s flair and love of tools. In the 1990s, Roger realized and
demonstrated how to make enormous mirrors which were
nearly hollow, supported by a honeycomb structure and thus
much lighter and easier to manoeuvre than would otherwise be
the case. They’ve become the de facto standard for large mono-
lithic mirror telescopes. The only alternative, utilized by the next
generation of extremely large telescopes, is to use a mirror that
comprises multiple, usually hexagonal, segments, but where
possible the simple charms of a single mirror still hold sway.
How Science iS Done 33
When I first visited the mirror lab under the stadium, a typical
example of its products was laid out on the floor. Like most of
Roger’s large mirrors, it was 8.4 metres across, a size dictated
not by scientific need or even by cost, but by the maximum size
that can be easily transported on American highways. I got to
scramble on top of it, and tried hard to imagine it at the heart of
an enormous telescope swinging around the sky. Since I was
there, that mirror has had its turn on the enormous polishing
machine, which ground it slowly to the correct shape with
almost unbelievable accuracy by the careful application of a
black goo called pitch in a process that, degree of mechaniza-
tion aside, hasn’t changed much since Newton’s day. The pro-
cess took months, but at the end of it the main mirror for the
telescope that will drive astronomy’s new data deluge was
ready.
The telescope in question is known, somewhat clumsily, as the
Large Synoptic Survey Telescope (LSST). Large it certainly is;
with its giant mirror, it can compete with the largest telescopes
in the world today. The key word, though, is synoptic. The plan is
for the telescope to complete a general survey, scanning the
whole sky available to it on average once every three nights,
making a movie of the sky. Among the thirty terabytes of data it
will produce every night will be discoveries of asteroids whip-
ping around the Solar System, the signs of stellar death in the
form of supernovae, and the flickering of galaxies as material
falls irretrievably down to their central black holes. Construction
of the telescope is now underway, yet astronomers including
myself are still struggling to get our heads around the sheer size
of LSST data. Even if, for example, you decide you only care about
things that change from night to night, you should expect, at a
conservative estimate, a million alerts a night. Filtering that list of events to find those worthy of our attention is essential for
LSST science, but understanding how to do that well requires a
research programme of its own.
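To make the filtering problem concrete, here is a minimal sketch in Python of one very simple strategy: score every alert, then keep only as many as you can afford to follow up. Everything in it is an assumption made for illustration. The `Alert` fields, the `interest_score` rule, and the `budget` parameter are hypothetical and are not taken from any LSST software.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """A hypothetical, heavily simplified alert record."""
    alert_id: int
    brightness_change: float   # change since the last visit (made-up units)
    is_known_variable: bool    # already catalogued as a variable source?


def interest_score(alert: Alert) -> float:
    """Toy scoring rule: ignore known variables, favour big changes."""
    if alert.is_known_variable:
        return 0.0
    return abs(alert.brightness_change)


def shortlist(alerts, budget: int):
    """Return the `budget` highest-scoring alerts, best first."""
    return heapq.nlargest(budget, alerts, key=interest_score)


if __name__ == "__main__":
    night = [
        Alert(1, 0.1, False),
        Alert(2, 2.5, False),
        Alert(3, 3.0, True),
        Alert(4, -1.2, False),
    ]
    for alert in shortlist(night, budget=2):
        print(alert.alert_id, interest_score(alert))
```

The hard part, of course, is the scoring rule. A single threshold on brightness is a placeholder for what the text calls a research programme of its own.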
LSST is just a few years away, but we can already
see even greater challenges on the horizon. The next big inter-
national project in astronomy is a radio telescope, known as the
Square Kilometre Array (SKA). Rather than being a single mono-
lithic structure, the SKA will span two continents, scattering sen-
sitive radio receivers throughout the emptiest parts of southern
Africa and Western Australia. Away from the noisy trappings of
civilization (and especially pleased to be free of interference from
mobile phones), the SKA will listen to the cosmos with a sensitiv-
ity never before achieved. The telescope will be so powerful that
there are serious worries that attempts to observe nearby sources
with it will be swamped by the presence of millions of previously
undetected background galaxies, and serious consideration is
being given to the feasibility of finding alien airport radar on
nearby planets.*
* Initially thought to be a promising source of signals for SKA-era SETI (the search for extraterrestrial intelligence), the consensus seems to be that the fact that we don’t know the rotation rate of the planets involved will stymie any serious search. It seems it may be simply too hard a task for us to expect to pick out an unknown signal without knowing when its host planet will be positioned just right for us on Earth to intercept a signal meant for the incoming space plane from Alpha Centauri. Still, the fact that this is even worth arguing about gives you an idea of quite how sensitive this new telescope will be.
It’s the volume of data that matters, though, and here it’s hard
to find a proper comparison. I could tell you that the SKA will
provide as much information in its first week as exists in the five
million million million words that have been uttered during the
history of humanity. It is certainly an impressive statistic, with
the additional advantage of being true, but I’m not sure it helps
one really get a grip on what’s going on. Does it help to know
that the total data rate flowing between dishes will amount to ten
times the current traffic on the internet? I’m not sure, but take
my word for it: SKA is a project that will live and die on its ability to handle large data sets, and this vulnerability is not confined to
astronomy. Whether you’re an oceanographer contemplating
data flowing from a new generation of Earth-observation satel-
lites or an ecologist carpeting your study area with motion sensi-
tive cameras, you’re going to spend a large part of the next decade
thinking about data processing.
Of course, scientists have been here before. It’s the largest pro-
jects, whether the Human Genome Project or the LHC at CERN,
that have had to confront the data deluge first. The world wide
web was built as a way of sharing information produced by the
latter, but the real action happens deep underground in the
experimental cavities within which precision engineering brings
the beams of particles, travelling in opposite directions around the
27-kilometre-long tunnel at much more than 99 per cent of the
speed of light, together to collide.
At the instant particles collide at these sorts of velocities a
tremendous amount of energy is released, creating conditions
not seen since the first tiny fraction of a second after the Big
Bang.* Most of that energy quickly results in the formation of
new particles which fly outwards from the point of collision.
Many of the new particles are unstable, and so decay into
further particles, creating a complicated cascade of debris. It’s this
shower of particles, some created in the collision and some the
result of subsequent decay, that crashes into the successive layers
of detectors wrapped around the collider beam and is picked
up by the carefully calibrated instruments. One layer might be
designed to deflect and thus measure the properties of particles
with positive or negative charge, while a final layer might be a
calorimeter designed to absorb the energy of particles that
make it that far.
* This statement of course ignores the possibility of alien particle physicists who may have built colliders greater than our own. This is perhaps unfair; any civilization which has grown to at least our puny technological civilization’s level has presumably its share of creatures with the same love of banging nature together to see what it’s made of that characterizes the Earthly experimental physicist. Still, if they are out there they don’t publish in our journals, which is all that really counts.
By piecing together what each of these detectors finds, the
scientists can work out what happened in the short time after the
collision. When, on 4 July 2012, researchers from two of the
experiments at the LHC, ATLAS and CMS, announced that they
had evidence for the elusive Higgs boson, what they had
actually seen was a repeating pattern of several different cascades of
particles which corresponded to what was expected if the Higgs
had (briefly) been created. There is no box of bosons in the CERN
visitor centre, but the evidence for its existence had been piling
up in collision after collision provided by the LHC’s collider team
to the eager and waiting physicists.
But how did they find those tell-tale signatures in the data?
Most events do not produce Higgs bosons. Indeed, the produc-
tion of such a particle is enormously rare, but luckily by 2012,
over 300 trillion (300,000,000,000,000) collisions had been
recorded. That breaks down to a rate of around 600 million col-
lisions a second, or 300 gigabytes a second of data, the equivalent
of having the entirety of English Wikipedia read to you seven
times each and every second. Were you subject to such a cacoph-
ony, I suspect you’d reach for the same solution as CERN’s
scientists. They filter the data they receive from the collider’s
experiments, throwing out much of it almost instantly and keep-
ing only those events which match a predefined set of triggers.
Anything corresponding to a Higgs event, for example, would be
snarfled, saved for future Nobel Prize-winning analysis, but more
than 99.999 per cent of the data collected by the LHC is discarded
within a second or so of being received.
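The idea of a trigger can be sketched in a few lines of Python. This is an illustration only, not CERN’s software: the event fields, the trigger names, and the thresholds below are all invented, and real triggers run in custom hardware and large computing farms rather than a simple loop.

```python
import random

# Hypothetical triggers: each is a named test applied to an event summary.
# The field names and thresholds are made up for this sketch.
TRIGGERS = {
    "two_photons": lambda e: e["photons"] >= 2,
    "missing_energy": lambda e: e["missing_energy_gev"] > 50.0,
}


def fired_triggers(event, triggers=TRIGGERS):
    """Return the names of every trigger this event satisfies."""
    return [name for name, test in triggers.items() if test(event)]


def filter_stream(events, triggers=TRIGGERS):
    """Yield (event, fired) for matching events; everything else is dropped."""
    for event in events:
        fired = fired_triggers(event, triggers)
        if fired:
            yield event, fired


def make_event(rng):
    """Generate a fake event summary (purely synthetic numbers)."""
    return {
        "photons": rng.choice([0, 0, 0, 1, 2]),
        "missing_energy_gev": rng.expovariate(1 / 5.0),
    }


if __name__ == "__main__":
    rng = random.Random(42)
    stream = (make_event(rng) for _ in range(100_000))
    kept = sum(1 for _ in filter_stream(stream))
    print(f"kept {kept} of 100000 synthetic events")
```

Note that `filter_stream` never stores the events it rejects, so a trigger that was not defined before the data arrived cannot be applied afterwards. Taking the figures quoted above at face value, discarding more than 99.999 per cent of 600 million collisions a second leaves fewer than 6,000 events a second to keep.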
The LHC, though, has never been just a Higgs-seeking machine.
Plenty of other experiments are underway, each with their own
set of triggers to snatch information from the flow of live events.
One of the most exciting for those of an astronomical bent is the
search for dark matter, and this is a little different. Dark matter is, we think, the stuff the Universe is made of. All of these atoms, all
of these protons, electrons, and neutrons, all of these neutrinos,
muons, and more amount to only so much scum floating on a
sea of dark matter. It accounts for about 80 per cent of the matter
in the Universe, and the embarrassing fact is that we don’t know
what it is.
Help is, however, on hand. We have good evidence that what-
ever dark matter is, it behaves as if it is composed of massive neu-
tral particles. You might think of a sea of particles, each with the
mass of the nucleus of a copper atom, but neutrally charged so
that it can’t interact with light. (Such particles are known as
WIMPs: weakly interacting massive particles.) If this explanation
is accurate, then it seems possible that dark matter particles will
be produced in some fraction of the collisions at the LHC. They
would likely shun the embrace of both the ATLAS and CMS detectors,
fly straight through, and thus show up as a loss of energy in the
experiment.
That missing energy would be hugely exciting, for it would
mean that an unexpected particle, whether or not it turns out to
be responsible for dark matter, was being created within the col-
lider. Knowing how much energy was missing (and thus the
energy needed to create such a particle) would allow physicists
to focus their search and start to pin down its properties. It’s pos-
sible, though, that the LHC would already have the data that’s
needed; if the particles sometimes weakly interact with the
detectors then in the morass of previously discarded data should
be nuggets of gold.* More likely though, if the LHC is producing
(as we all hope it will!) some truly unexpected physics, the years
of prior experimental runs will count for nearly nothing. If the
triggers weren’t set to collect the right type of data (if the
evidence for dark matter interactions or whatever else has been
thrown out with the junk) then there’s nothing for it but to
reset the triggers and run the experiment again.
This isn’t supposed to be a criticism of the LHC. The truth is
that their data rate is so extreme, and our ability to capture, store, and sort data so puny in the face of such an onslaught, that they
really have no choice but to throw out much of what is produced.
They’ve also set triggers that might catch likely dark matter can-
didates, but I like thinking about CERN’s struggles to do the right
thing because they make clear the complexity of modern sci-
ence, and the decisions that we have to confront when dealing
with large data sets. We are a long way from simple experiments
with one variable changed each time and into the realms of big
computation—despite the hype, we have become reliant on big
data.
Yet not all responses to overwhelmingly large data sets need
be so alien. Sometimes, the solution is not to reinvent the pro-
cess of discovery, but instead to look at what scientists have been
doing for years. It’s just that with more data, you need more sci-
entists. And that, dear reader, is where you come in. You, and
everyone you know.
* I don’t mean that literally, although when the LHC is not colliding protons it collides lead nuclei, and this has probably produced a residue of gold, albeit not at an economically viable price. It’s still nice for modern physics to have fulfilled the dreams of alchemists through the ages, though.
2
THE CROWD AND THE COSMOS
When did humanity become aware of the Universe? Not of
outer space itself, and not only of the stars that speckle our
local neighbourhood, but of the whole kit and caboodle, the
potentially infinite realm that stretches out for billions of light
years in every direction? You can make a case, I think, that it was
when we first discovered galaxies, or rather when we found that
these often enigmatic objects were in fact immense systems of
hundreds of billions of stars.
Look at a galaxy through a small telescope, and you won’t see
any stars. You won’t, in fact, see much of anything, just a misty
patch of diffuse light. Only when you realize that that light is
generated by a vast number of stars, each too distant to be
resolved, does the true distance to these objects become appar-
ent. They are revealed as what used to be known as ‘island uni-
verses’, individual travellers separated by vast oceans of empty
space. A few centuries after being displaced from the centre of
the Universe by the Copernican revolution, discovering that their
galaxy, the Milky Way, was nothing special, dealt the denizens of
Earth another blow to their collective ego.
That might be depressing, or the vast scale of the stage on
which the Universal drama is played out might inspire you. In
either case, these discoveries were only the start of astronomers’
attempts to understand the formation and evolution of the galax-
ies. Our attempts to understand how we ended up with the
Universe we see around us, and in particular how the galaxies we
see got to be the way they are, are driving some of the most ambi-
tious and exciting projects in astronomy. It’s been that way for a
while, and back in July 2007 I found myself listening to the latest
arguments at a conference in Piccadilly organized by the Royal
Astronomical Society.
I’m not sure what images the mention of a scientific confer-
ence will conjure up. Maybe a bearded Russian theorist mum-
bling nearly incomprehensibly at a chalky blackboard? Maybe
a whole gamut of scruffy academics engaging in hand-waving
debate about obscure and incomprehensible points? The latter’s
about right, at least for the conferences I go to. Ignore any
thoughts about calm and considered discussion; the atmosphere
is often more febrile than you might imagine, but the rudest of all
are not those indulging in backbiting and snark, but rather those
of us who disappear into our laptops, barely conscious of being
in a lecture hall at all. We’ll have fought for the few seats with
power sockets, look up from our iDevices only occasionally, and
will—at larger conferences—know to sit on the edges of vast
hotel ballrooms so that we can plug in.
To hide from the speaker and get on with work you could be
doing at home is rude, and distracting, but I’m as bad as, if not
worse than, most, and on that sunny July morning you’d have
found me prodding at a laptop, grumpy in the middle of the back
row in a cramped and already sweltering lecture theatre. It must
have taken almost two or three slides from the first speaker to set
my mind wandering and the search for viable Wi-Fi to begin.
The Crowd and the Cosmos: Adventures in the Zooniverse Page 6