
Humble Pi


by Matt Parker


  And financial data is definitely not uniform. A lot of financial data conforms to Benford’s Law, which says that in some types of real-world data the lead digits are not equally likely. If lead digits were uniform, they would each appear around 11.1 per cent of the time. But, in reality, the chance of starting with, let’s say, a 1, depends on the range of numbers being used. Imagine you were measuring things up to 2 metres long in centimetres: 55.5 per cent of all the numbers from one to two hundred start with a 1. Imagine picking a date at random: 36.1 per cent of all dates have a day of the month starting with 1. Averaging across different distribution sizes means that, in a big enough set of appropriate data, around 30 per cent of numbers will start with a 1, down to only 4.6 per cent with a 9.

Expected Benford’s Law distribution against the lead digits of the populations of the 3,141 US counties (in 2000).
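Those expected proportions come straight from the formula behind Benford’s Law: the chance of a number having lead digit d is log10(1 + 1/d). A minimal sketch in Python reproduces the figures quoted above:

```python
import math

# Benford's Law: the probability of lead digit d is log10(1 + 1/d).
for d in range(1, 10):
    print(d, f"{math.log10(1 + 1 / d):.1%}")
# Digit 1 comes out at about 30.1 per cent; digit 9 at about 4.6 per cent.
```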

  Real-world data tends to be amazingly close to this distribution – except when the numbers are made up. In one documented example a restaurant owner was making up their daily sales totals, evidently to reduce how much tax they would have to pay. But when the lead digits were plotted they were completely different to what Benford’s Law predicts. And even if the first digits of numbers follow Benford’s Law, often the last digits of the numbers are effectively random and should still be a uniform distribution. All two-digit combinations should appear 1 per cent of the time, but 6.6 per cent of the time the restaurant’s daily totals ended with 40. This wasn’t a quirk of their prices: the owner seemingly liked the number 40. As always, humans are terrible at being random. And it turns out restaurants are not very good at cooking books.

  Lead digits on the left; final two digits on the right.
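The final-two-digits check is easy to run yourself. A minimal sketch, with made-up daily totals standing in for the restaurant’s books (the real figures are not public):

```python
from collections import Counter

# Hypothetical daily sales totals, in cents -- invented numbers for illustration only.
totals = [12740, 9640, 11385, 10240, 8740, 13140, 9957, 10432]

last_two = Counter(f"{t % 100:02d}" for t in totals)
for ending, count in last_two.most_common():
    print(ending, f"{count / len(totals):.1%}")
# In honest data each of the 100 possible endings should hover around 1 per cent;
# a single ending (like '40') turning up 6.6 per cent of the time is a red flag.
```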

  Benford’s Law also applies when looking at the first two digits of numbers, and this is one of the things forensic accountants look for. It’s hard to get real-world examples of this being used to spot tax fraud, and all the forensic accountants I’ve ever met have refused to be named or speak on the record. But there is some old data we can look at. Mark Nigrini is an associate professor at the West Virginia University College of Business and Economics and he analysed a dataset of 157,518 taxpayer records from 1978 which the Internal Revenue Service had anonymized and released. He looked at the first two digits of three different values people could declare on their tax returns:

  Interest income amounts are the amount of interest people earn in a year and come from their bank records; they are, as noted by Nigrini, subject to ‘extensive third-party reporting’. In other words, the IRS could check if people are telling the truth. This plot shows a near-perfect match to the Benford’s Law distribution.

  The amount of money earned from dividends is not as easy for the IRS to check but it is still subject to some ‘less stringent’ third-party reporting, and the distribution as a whole deviates only slightly from Benford’s Distribution. So maybe there was a small amount of fudging going on. There are large spikes at 00 and 50 (and smaller spikes at other multiples of 10), which implies that some people are estimating their dividend income instead of reporting it exactly.

  In 1978 people were trusted to add up all their interest paid on mortgages, credit cards, and so on, and to report it with little to no additional checks. There are smaller spikes at 00 and 50, which shows that people are more reluctant to look like they have estimated these values. This plot also shows the greatest divergence from the expected Benford’s Distribution. That does not necessarily mean fraud; it merely implies that the data has been influenced somehow. In this case, much of the deviation seems to be because people with small interest expenses did not bother reporting them.

  I cannot say with certainty what distribution tests modern tax agencies use, but I can pretty much guarantee they do things like this and then take a closer look at anything which deviates from the expected. So if you are going to defraud the state on your tax return, you need to make sure you can generate the right type of random numbers. I’m just hoping that Her Majesty’s Revenue and Customs does not look for tax returns which exactly match Benford’s Law (like, suspiciously closely) to spot mathematicians precisely fudging their numbers …

  Like, so totally random

  So it turns out that true randomness is more predictable than people expect. And if even tax collectors know how to spot fake random data, can pseudorandom numbers ever be indistinguishable from the real thing? Thankfully, done carefully, a pseudorandom sequence can have almost all the properties which random numbers are expected to exhibit.

  Forget funny distributions. As a source of randomness, pseudorandom numbers should be completely uniform and independent. This is the bland building block of randomness, and users can flavour those random numbers into whatever bespoke distribution they may require.
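As a rough illustration of that flavouring, a single uniform number can be reshaped into other distributions with standard recipes such as simple scaling or the inverse-transform trick; a minimal sketch:

```python
import math
import random

u = random.random()                   # the bland building block: uniform on [0, 1)

die_roll = int(u * 6) + 1             # reshaped into a fair six-sided die roll
wait_time = -math.log(1 - u) / 0.5    # reshaped into an exponential wait (rate 0.5)
                                      # via the inverse-transform method
print(die_roll, wait_time)
```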

  There are only two golden rules for plain-style random data:

  All outcomes are equally likely.

  Each event has no effect on what comes next.

  Potato.

  When I was checking for fraudulently random homework I used only two tests: a frequency test to make sure heads and tails appeared about as often as each other and a runs test to check longer combinations of results. But these are only a starting point. There are loads of ways you can check if data conforms to my two golden rules. And there is no one definitive battery of tests you should use; over the years, people have come up with all sorts of interesting ways to check how random data is, and no one test works on its own.
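Both of those checks fit in a few lines. Here is a minimal sketch using the standard Wald–Wolfowitz runs statistic (one common flavour of runs test, not the only option):

```python
import math
import random

flips = [random.choice("HT") for _ in range(200)]   # stand-in for a student's 'coin flips'

# Frequency test: heads and tails should appear about as often as each other.
heads, tails = flips.count("H"), flips.count("T")

# Runs test: count maximal runs of identical results and compare with what a
# genuinely random sequence of this composition would produce.
n = len(flips)
runs = 1 + sum(1 for a, b in zip(flips, flips[1:]) if a != b)
expected_runs = 2 * heads * tails / n + 1
variance = 2 * heads * tails * (2 * heads * tails - n) / (n ** 2 * (n - 1))
z = (runs - expected_runs) / math.sqrt(variance)

print(f"heads={heads} tails={tails} runs={runs} z={z:+.2f}")
# A |z| much bigger than about 2 suggests the flips were written down by a human.
```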

  My favourite is an ensemble of tests called the diehard package. Sadly, this does not involve throwing the numbers off the Nakatomi Tower or making them crawl through an air vent. But, in my experience, it does help if during the process you yell, ‘Yippee ki-yay number-function!’ The diehard package is actually a collection of twelve separate tests.

  Some of the tests are pretty boring, like checking for increasing and decreasing runs of digits. So, for the base-10 random number 0.5772156649, there is an increasing sequence 5–7–7 and then the decreasing sequence 7–7–2–1. These runs in different directions should be of an expected range of lengths. Or there’s the bitstream test, which converts the numbers into binary and looks at overlapping groups of twenty digits, checking which of all the 1,048,576 possible twenty-digit binary numbers are in each 2,097,171-digit block. Truly random binary data should be missing around 141,909 of them (with a standard deviation of 428).
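A sketch of the bitstream test’s core idea, using Python’s own generator as the data under test (the real diehard code has more machinery around it):

```python
import random

WORD_BITS = 20
N_WORDS = 2 ** 21                       # 2,097,152 overlapping twenty-bit words

bits = [random.getrandbits(1) for _ in range(N_WORDS + WORD_BITS - 1)]

seen = bytearray(2 ** WORD_BITS)        # one flag per possible twenty-bit word
word = 0
for b in bits[:WORD_BITS]:              # build the first window
    word = (word << 1) | b
seen[word] = 1

mask = (1 << WORD_BITS) - 1
for b in bits[WORD_BITS:]:              # slide the window along one bit at a time
    word = ((word << 1) | b) & mask
    seen[word] = 1

print(seen.count(0))   # should be roughly 141,909 missing words (sigma about 428)
```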

  Then there are the fun tests. These use the data in a strange situation to see if it behaves as expected. The parking-lot test uses the purportedly random sequence to place circular cars in a 100-metre by 100-metre parking lot, with any car that would overlap one already parked counting as a crash and being abandoned. After 12,000 cars have tried to park at random, around 3,523 of them should have parked successfully (standard deviation of 21.9). Another test puts different-sized spheres inside a cube, and one test is simply to make the data play two hundred thousand games of craps (an old-timey dice game) and check whether the number of wins, and the number of throws each game takes, follow the expected distributions.
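Here is a minimal sketch of the parking-lot test. The overlap rule follows the commonly cited diehard implementation (a crash when a new centre lands within 1 unit of an existing one in both x and y); the coordinates would normally come from the sequence being tested rather than Python’s own generator.

```python
import random

ATTEMPTS = 12_000
SIDE = 100.0

parked = []                                         # centres of successfully parked cars
for _ in range(ATTEMPTS):
    x, y = random.uniform(0, SIDE), random.uniform(0, SIDE)
    if any(abs(x - px) <= 1.0 and abs(y - py) <= 1.0 for px, py in parked):
        continue                                    # crash: abandon this attempt
    parked.append((x, y))

print(len(parked))   # around 3,523 for genuinely random coordinates (sigma about 21.9)
```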

  How exotic and weird these tests are is part of their appeal. Random data should be equally random in all situations. If the tests for randomness were predictable, then pseudorandom-sequence algorithms would evolve to match just those tests. But if a sequence might be checked to see what average score it gets after two hundred thousand rounds of the 1994 Sega Mega Drive game Shaq Fu, then it had better be truly random.

  There is one catch-all definition of randomness and, even though it is a bit too esoteric to be useful, I love it for its simplicity. A random sequence can be defined as any sequence which is equal to or shorter than any description of it. The length of the description of a random sequence is called its Kolmogorov complexity, after Russian mathematician Andrey Kolmogorov, who suggested it in 1963. If you can write a short computer program to generate a sequence of numbers, then that sequence cannot be random. If the only way to communicate a sequence is to print it out in its entirety, then you’ve got some randomness on your hands. And printing random numbers out is sometimes the best option.
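Compression gives a crude, hands-on feel for this (it is only a stand-in for Kolmogorov complexity, which cannot be computed exactly in general): a predictable sequence squashes down to almost nothing, while random bytes barely shrink at all.

```python
import os
import zlib

predictable = b"0" * 1_000_000        # could be generated by a one-line program
random_bytes = os.urandom(1_000_000)  # no known description shorter than itself

print(len(zlib.compress(predictable)))   # a thousand or so bytes
print(len(zlib.compress(random_bytes)))  # barely smaller than a million bytes
```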

  Let’s get physical

  Before the computer age, lists of random numbers had to be generated in advance and printed as books for people to buy. I say ‘before the computer age’, but when I was at school in the 1990s we still had books with tables of random numbers in them. Handheld calculators, and indeed handheld computers, have come a long way since then. But, for true randomness, the printed page is hard to beat.

  You can still buy books of random numbers online. If you have not done so before, you must read the online reviews of books of random numbers. You’d think people would not have much to say about lists of random digits, but this vacuum brings out the creativity in people.

  ★★★★★ By Mark Pack

  Don’t turn to the last page and spoil the book by finding out the ending. Make sure you read from page 1 and let the tension build.

  ★★★ By R. Rosini

  While the printed version is good, I would have expected the publisher to have an audiobook version as well.

  ★★★★ By Roy

  If you like this book, I highly recommend that you read it in the original binary. As with most translations, conversion from binary to decimal frequently causes a loss of information and, unfortunately, it’s the most significant digits that are lost in the conversion.

  ★★★★★ By vangelion

  Still a better love story than ‘Twilight’.

  What if you don’t have time to buy a book and absolutely need a random number? Well, you’re going to need a random object. Nothing beats physical actions like flipping a coin or rolling a dice for getting real random numbers, even in our modern, high-tech age. This is why I keep various dice on me at all times, including a sixty-sided dice in case I need to generate a random seed for a bitcoin address (which uses base-58 numbers).

  In the San Francisco office of Cloudflare, lava lamps are used as a physical randomness generator. This is back to internet security and SSL, but on a much bigger scale than Netscape: Cloudflare handles over a quarter of a quadrillion encryption requests per day. About 10 per cent of all web traffic relies on Cloudflare. This means they need a lot of cryptographic-quality random numbers.

  To meet this demand, they have a camera pointed at a hundred lava lamps in their lobby. It takes a photo once a millisecond and converts the random noise in the image into a stream of random 1s and 0s. The colourful moving blobs of the lava lamps help add to the noise, but it is actually the tiny fluctuations in pixel values which are the heart of the randomness. Their office in London has a chaotic pendulum bouncing about, and the Singapore set-up uses a far less visually exciting radioactive source.
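The gist of turning a photo into bits can be sketched very simply; assume frame_bytes is the raw pixel data from one shot of the lamps (the full LavaRand pipeline then feeds this kind of entropy into a proper cryptographic generator, which is not shown here).

```python
import hashlib

def bits_from_frame(frame_bytes: bytes) -> str:
    """Condense one frame's pixel noise into 256 hard-to-predict bits."""
    digest = hashlib.sha256(frame_bytes).digest()
    return "".join(f"{byte:08b}" for byte in digest)

print(bits_from_frame(b"placeholder pixel bytes"))  # a real frame would be megabytes
```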

  Unlike most of the crap in tech-company lobbies, these lava lamps serve a purpose.

  When it comes down to it, though, you can’t beat the cost effectiveness of a coin. An engineering friend of mine was working on a record-breakingly tall, skinny tower in 2016 and discovered that engineers are simply not random enough. One of the issues with an incredibly thin tower is that the wind can set it vibrating like a guitar string and, if the wind matches the frequency it likes to resonate at, it could tear the tower apart.

  To stop this, they designed patches of wind baffles to be attached to the outside of the tower to break up the wind flow. But, very importantly, they had to be put on randomly. If they were too uniform, they would not break the wind up enough. How did the engineers make sure they were placed randomly? To choose which sections would be with or without baffles, someone in the office flipped a coin.

  THIRTEEN

  Does Not Compute

  In 1996 a group of scientists and engineers were poised to launch a group of four satellites to investigate the Earth’s magnetosphere. The project had involved a decade of planning, designing, testing and building. The process is slow because, once a spacecraft is in space, it is very difficult to do any repairs. You don’t want to make any mistakes. Everything needs to be triple-checked. Now called the Cluster mission, the finished satellites were loaded on to a European Space Agency (ESA) Ariane 5 rocket in June 1996, ready to be launched into orbit from the Guiana Space Centre in South America.

  We will never know if those spacecraft would have functioned as intended, as within forty seconds of lift-off the Ariane had activated its self-destruct system and exploded in the sky. Parts of the rocket and its payload of spacecraft rained down on 12 square kilometres of mangrove swamp and savanna in French Guiana.

  One of the principal investigators of the Cluster mission still works at the Mullard Space Science Laboratory (part of University College London), where my wife now works. After the disaster, parts of the spacecraft were recovered and shipped back to UCL, and the investigators opened the box to see years of work now represented by twisted chunks of metal with bits of swamp still attached. They are now on display in the staff common room as a reminder to the next generation of space scientists that they are spending their careers on ideas which can disappear in a puff of second-stage booster.

  Twisted metal and electronics representing a decade of hard work.

  Thankfully, the European Space Agency decided to rebuild the Cluster mission and try again. The Cluster II satellites were successfully put into orbit in 2000 by a Russian rocket. Originally planned to last two years in space, they are now coming up on two decades of successful operation.

  So what went wrong with the Ariane 5 rocket? In short, the onboard computer tried to copy a 64-bit number into a 16-bit space. Online reports are quick to blame the maths, but the computer code must have been written in such a way as to cause that to happen. Programming is just formalized mathematical thought and processes. I wanted to know what that number was, why it had been copied to a location in the memory which was too small and why that had brought down an entire rocket … so I downloaded and trudged through the investigation report issued by ESA’s Inquiry Board.

  The original programmer (or team of programmers) of this code did their work brilliantly. They put together a perfectly working Inertial Reference System (SRI, the initials coming from its French name, Système de Référence Inertielle) so the rocket always knew exactly where it was and what it was doing. An SRI is basically an interpreter between the sensors tracking the rocket and the computer doing the driving. The SRI could be linked to several sensors around a rocket, take the raw data coming from those gyroscopes and accelerometers and convert it into meaningful information. The SRI was also linked to the main onboard computer and fed it all the details of which way the rocket was facing and how fast it was going.

  During this translational work the SRI would be converting all sorts of data between different formats, which is the natural habitat of computerized maths errors. The programmers identified seven cases where a floating-point value was coming in from a sensor and being converted into an integer. This is exactly the kind of situation where a longer number could accidentally be fed into a space that was too small, which would grind the program to a halt with an operand error.

  To avoid this, it was possible to add a bit of extra code which looks at the incoming values and asks, ‘Is this going to cause an operand error if we try to convert it?’ Blanket use of this process could comprehensively safeguard against conversion errors. But making the program go through an extra check every time a conversion is about to happen is very processor-intensive, and the team had been given strict limits on how much processing power their code was allowed to use.
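A minimal sketch of the two options, in Python rather than the Ada actually running on the SRI (32,767 being the biggest value a signed 16-bit integer can hold):

```python
INT16_MIN, INT16_MAX = -32_768, 32_767

def convert_unprotected(value: float) -> int:
    # The unguarded version: it simply assumes the value fits. An out-of-range
    # input here is the equivalent of the operand error that halted the SRI.
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit integer")
    return result

def convert_guarded(value: float) -> int:
    # The extra, processor-hungry check: test the range first, then decide what
    # to do. (Clamping here is just for illustration; how an out-of-range value
    # should really be handled is a separate design decision.)
    return int(max(min(value, INT16_MAX), INT16_MIN))

print(convert_guarded(1e6))        # 32767: caught and contained
print(convert_unprotected(1e6))    # raises OverflowError
```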

  No problem, they thought: we’ll go back a step and look at the actual sensors that are sending the SRI the data and see what range of values they can possibly produce. For three of the inconvenient seven it was found that the input could never be big enough to cause an operand error, so protection was not added. The other four variables could possibly be too big, so they were always run through the safety check.

  Which was all great … for the Ariane 4 rocket, the precursor of the Ariane 5. After years of faithful service, the SRI was pulled out of the Ariane 4 and used in the Ariane 5 without a proper check of the code. The Ariane 5 was designed with a different take-off trajectory to the Ariane 4, which involved greater horizontal speeds early in the launch. The trajectory of an Ariane 4 meant that this horizontal velocity would never be big enough to cause a problem, so it was not checked. But on the Ariane 5 it quickly exceeded the space available for the value within the SRI and the system threw an operand error. But this alone was not what brought the rocket down.

  If everything went wrong with the flight of a rocket and things were clearly going to end in disaster, the SRI had been programmed to do a few admin tasks on the way out. Most importantly, it dumped all the data about what it was doing to a separate storage location. This would be vital data in any post-disaster investigation, so it was worth making sure it was saved. Like someone using their last breath to yell, ‘Tell my spouse I love them!’ – only it was a processor shouting, ‘Tell my debugger the following failure-context data!’

  In the Ariane 4 system the data would be sent from the SRI to some external storage. Unfortunately, in the new Ariane 5 set-up, this ‘crash report’ was sent down the main connection from the SRI to the onboard computer. The new Ariane 5 onboard computer had never been warned that it might get a diagnostic report should the SRI have bitten the bullet, so it assumed this was just more flight information and tried to read the data as if it were angles and velocities. It’s an oddly similar situation to when Pac-Man has an overflow error and tries to interpret the game data as if it were fruit data. Except now there are explosives involved.
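The same-bytes-different-meaning confusion is easy to demonstrate with invented bytes (the real telegram layout lives in the ESA report and is not reproduced here): pack some made-up diagnostic words, then read the identical eight bytes back as if they were a 64-bit flight value.

```python
import struct

# Four made-up 16-bit diagnostic words standing in for part of the SRI's dump.
diagnostic = struct.pack(">HHHH", 0xDEAD, 0x0042, 0x0001, 0xBEEF)

# The onboard computer expected flight data, so it reads the same eight bytes
# as a big-endian 64-bit floating-point number instead.
as_flight_value = struct.unpack(">d", diagnostic)[0]
print(as_flight_value)   # a meaningless, alarming-looking number
```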

 
