The Tiger That Isn't


by Andrew Dilnot


  In other words, performance measurement would work best if it took into account the needs of the whole elephant, alongside the costs, something which advocacy groups, or the process of piecemeal standard setting, almost by definition, don't do.

  Healthcare involves choices. If we are to make them well, we need to recognise that the ideal standard of care for one condition might mean sacrificing standards for another. It is no good simply to say that we want the best of everything, unless we are prepared to bankrupt ourselves for it. In practice, we need to assess the claims of every treatment against the claims of every other. These claims will vary from patient to patient. How do we choose between them if we have already laid down rules in advance for how each one should be treated? The more we specify, the less we can choose.

  There is no simple answer to this. We do not advocate renouncing all rules or performance measures any more than we think that everything should be specified. But we do have to understand the risks, of which there are broadly two: either that people cheat, or that they do exactly what you ask, and then, when this turns out to be at the expense of everything else, you sort of wish they hadn't.

  If we trusted people to get on with the job as well as it could be done, none of this would arise, but we don't, sometimes with reason. So we intervene and try to change behaviour, hoping to steer it with numbers.

  Though the most conspicuous use of targets and performance indicators has been in healthcare, they are spreading. In the UK, they have arguably become the most trusted tool for managing the public sector. Trusted by the government, that is. For while the numbers have often shown improvement, not everyone believes them, and ministers struggle with the public perception that increased spending on public services has been poor value for money. Despite the hope that measurements and targets would ensure efficiency, many suspect that they have actually disguised waste.

  Examples beyond healthcare of the twisted logic attached to performance indicators include Britain's proud record of having the safest roads in Europe. We measure this by the number of accidents. But there are both elephant and gaming problems with our road-safety statistics. The first comes about because we define road safety by what happens on the roads. We take no account of the fact that many roads have become so fast, dual carriageways and the like, that pedestrians simply avoid them. Risk aversion is not the same as safety. So in some ways the roads might not be safer but more dangerous, even though casualties may be falling.

  But are casualties falling? Over the very long term they are, unquestionably. More recent evidence for adult casualties is less clear. The government has targeted road accidents, telling police forces they would be judged by their success in reducing the number killed and seriously injured on the roads. By 2010, it says, there should be 40 per cent fewer accidents overall than in the baseline period of 1994–8. As the target was introduced, the numbers began to fall and the government hailed a dramatic success.

  Then, in July 2006, the British Medical Journal reported an investigation of trends in accident statistics. This said that, according to the police, rates of people killed or seriously injured on the roads fell consistently, from 85.9 per 100,000 people in 1996 to 59.4 per 100,000 in 2004.

  But the police are not the only source for such statistics, and the BMJ authors' inspired idea was to check them against hospital records. There, recorded rates for traffic injuries were almost unchanged at 90.0 in 1996 and 91.1 in 2004. The authors concluded that the overall fall seen in police statistics for non-fatal road traffic injuries 'probably represents a fall in completeness of reporting of these injuries'.

  The police have some discretion about how they record an injury, and it is in this category of their statistics that the bulk of the improvement seems to have occurred. Deaths on the roads, where there is little scope for statistical discretion, have been largely flat in recent years in both police and hospital statistics. So, once targeted, it seems the police noted a fall in the one kind of accident over which they had some discretion in the reporting, but no other. And others didn't observe the same change. So did the accidents really go down, or did the police simply respond to the target by filling fewer notebooks?

  Here is one last example in a list long enough to suggest a generic problem. In response to concern about recycling rates in Britain lagging behind others in Europe, the government targeted them. Local councils responded with ingenuity and started collecting waste they had never tried to collect before, but that they could easily recycle. They called it, to give the whole enterprise a lick of environmental respectability, green waste. We do not really know what happened to this waste before: some was probably burnt, some thrown on the compost, some no doubt went in the black bin with the other rubbish. Now a big van came to collect it instead. Being heavy with water (waste is measured by weight), the vegetation did wonders for the recycling rate. There have even been stories of green waste being sprayed with water to make it even heavier. But is that really what people had in mind when they said there should be more recycling?

  If all this leads to the conclusion that measurement is futile, then it is a conclusion too far: whether your income is $1 a day or $100 can be measured reasonably well and it matters enormously. The key is to know the number's limitations: how much does it capture of what we really want to know? How much of the elephant does it show us? When is it wise to make this number everyone's objective? How will people behave if we do?

  This suggestion – that targets and other summary indicators need to be used with humility – implies another: that until they are, we must treat their results with care or even suspicion. After years of bruising experience, there are signs of such a change of approach. Britain's Healthcare Commission, responsible for monitoring performance in hospitals across England and Wales, no longer gives the impression of thinking that it can make fine distinctions between the quality of one hospital and another. Instead, it puts more of its energies into two strategies. The first is to set bottom-line standards of practice – as distinct from outcomes of performance – across all areas of healthcare that all are expected to achieve: are hospitals putting effort into cleaning the place properly, have staff been trained to report adverse incidents, and so on.

  The second, more interesting, is spotting the real problems, those whose performance is so out of line that it is unlikely to be a statistical artefact. Not the great bulk of hospitals and procedures that seem broadly all right or better – these are left largely alone – but those with a pattern of results that cause concern. This would be better described as surveillance, not performance measurement, or at least not as we have previously understood it. The Healthcare Commission calls its approach 'risk-based', which emphasises the difference between the conceit of performance measurement and the practical importance of spotting those at the margins who represent a potential danger. When Healthcare Commission inspectors think the numbers suggest a problem – and they are generally restrained in that judgement – they do not necessarily assume anything definitive, or go in with a big stick. So an investigation of 'apparently' high rates of mortality at Staffordshire NHS trust in 2008, particularly among emergency admissions, was duly cautious in airing its suspicions. Nigel Ellis, the Commission's head of investigations, said:

  An apparently high rate of mortality does not necessarily mean there are problems with safety. It may be there are other factors here such as the way that information about patients is recorded by the trust. Either way it does require us to ask questions, which is why we are carrying out this investigation.

  On top of this, they conduct a large number of spot checks, both random and targeted, to try to ensure that the data matches reality. If they find that hospitals have been falsely reporting standards, the hospitals are penalised. This is the first example we have seen where there are explicit costs attached to producing bad data. This seems to be working. Hospitals that concealed bad performance in the past but were found out – and paid for it – seem to turn into saintly confessors in succeeding years.

  But it is the data that really stands out that has a chance of telling us something useful, the performance that appears either awful (and worrying), or superb (and potentially instructive), not the mass in the middle that is exasperatingly hard to differentiate. There may still be boring explanations even for performance at the extremes, to do with the way the data is recorded, for example. But it is here, in our view, where the numbers offer good clues, if not definitive answers, here that anyone interested in improving performance should start work, never assuming that the numbers have done that work for them.

  The Healthcare Commission still does not routinely investigate possible areas of gaming, but there is now growing pressure to ensure the integrity of the data so this too might not be far off.

  In the past, there was an incentive not to bother. Both the target setter – the government – and target managers want the numbers to look good, and critics allege collusion between them. Some of the early measurements of waiting times, for example, were a snapshot during a short period that – dubiously – was announced well in advance, giving hospitals a nod and a wink to do whatever was necessary in that period, but not others, to score well. It was known that hospitals were diverting resources during the measured period to hit the target, before moving them back again.

  It is as well there is room for improvement, since Bevan and Hood say that targets and performance indicators are not easily dispensed with: alternatives like command and control from the centre are not much in favour either, nor is a free market in healthcare. If target setters were serious about the problem of gaming, an interesting approach would be to be a bit vague about the target, so that no one is quite sure how to duck it (will it be first appointments or follow ups?), introduce some randomness into the monitoring, and initiate more systematic monitoring of the integrity of the numbers.

  They conclude: 'Corrective action is needed to reduce the risk of the target regime being so undermined by gaming that it degenerates, as happened in the Soviet Union.'

  We haven't yet reached that point of ridicule, though it might not be far off. The Police Federation has complained (May 2007) that its members are spending increasing time prosecuting trivial crimes and neglecting more important duties simply in order to meet targets for arrest or summary fine. Recent reports include that of a boy in Manchester arrested under firearm laws for being in possession of a plastic pistol, a youth in Kent for throwing a slice of cucumber at another youngster, a man in Cheshire for being 'in possession of an egg with intent to throw', a student fined after 'insulting a police horse'. Perhaps the best (or worst) example is the child who, according to a delegate at a Police Federation conference, went around his neighbourhood collecting sponsorship money, then stole it. After a lengthy investigation, the police had to decide whether he had committed one offence or dozens. Since they knew the culprit and all the victims, they said 'dozens', as this did wonders for their detection rate. The absurdity of so much fatuous activity – all in the name of improving police performance – is that it will give the appearance of an increasing crime rate.

  Underlying many of the problems here is the simple fact that measurement is not passive; it often changes the very thing that we are measuring. And many of the measurements we hear every day, if strained too far, may have caricatured the world and so changed it in ways we never intended. Numbers are pure and true; counting never is. That limitation does not ruin counting by any means but, if you forget it, the world you think you know through numbers will be a neat, tidy illusion.

  7

  Risk: Bring Home the Bacon

  Numbers have amazing power to put life's anxieties into proportion: Will it be me? What happens if I do? What if I don't? They can't predict the future, but they can do something almost as impressive: taming chaos and turning it into probability. We actually have the ability to measure uncertainty.

  Yet this power is squandered through an often needless misalignment of the way people habitually think and the way risks and uncertainties are typically reported.

  The news says, 'Risk up 42 per cent', a solitary, abstract number. All you want to know is, 'Does that mean me?' There you are, wrestling with your fears and dilemmas, and the best you've got to go on is a percentage, typically going up, and generally no help whatsoever.

  Our thoughts about uncertainty are intensely personal, but the public and professional language can be absurdly abstract. No surprise, then, that when numbers are mixed with fear the result is often not the insight it could be, but confusion and fear out of proportion.

  It need not be like this. It is often easy to bring the numbers home, back into line with personal experience. Why that is not done is sometimes a shameful tale. When it is done, we often find that statements about risk that had appeared authoritative and somehow scientific were telling us nothing useful.

  The answer to anxiety about numbers around risk and uncertainty is, like other answers here, simple: be practical and human.

  Don't eat bacon. Just don't. That's not a 'cut down' or a 'limit your intake', it's a 'no'. This is the advice of the World Cancer Research Fund: avoid processed meat. 'Avoid' means leave it alone, if at all possible.

  The WCRF says: 'Research on processed meat shows cancer risk starts to increase with any portion.' And it is right; this is what the research shows. A massive joint report in 2007 found that an extra ounce of bacon a day increased the risk of colorectal cancer by 21 per cent.

  You will sense that there is a 'but' coming. Let it wait, and savour for a while the authority of the report (even if you can no longer savour a bacon sandwich), in the words of one country's cancer research institute:

  [The] Expert Report involved thousands of studies and hundreds of experts from around the world. First, a task force established a uniform and scientific method to collect the relevant evidence. Next, independent research teams from universities and research centres around the world collected all relevant literature on seventeen different cancers, along with research on causes of obesity, cancer survivors and other reports on chronic diseases. In the final step, an independent panel of twenty-one world-renowned scientists assessed and evaluated the vast body of evidence.

  All this is true. As far as it is possible to discern the effect of one part of our diet, lifestyle and environmental exposure from all others, the evidence is not bad, and has been responsibly interpreted. So what's the 'but'?

  The 'but' is that nothing we have said so far gives you the single most essential piece of information, namely, what the risk actually is. We've told you how much it goes up, but not where it started, or where it finished. In the reporting of risk by the media and others, that absurd, witless practice is standard. Size matters to risk: big is often bad, small often isn't, that's the whole point of quantification. You want to know if eating the food on your plate is of the same magnitude as playing Russian roulette, crossing the road, or breathing. It makes a difference, obviously, to whether you eat it. But in far too many reports, the risk itself is ignored.

  Percentage changes depend entirely on where you start: double a risk of 1 in a million (risk up 100 per cent!) and it becomes 2 in a million; put an extra bullet in the revolver and the risk of Russian roulette also doubles. But all the newspaper tells you is what the risk has gone up by (100 per cent in both cases). By this standard, one risk is apparently no better or worse than the other.
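  To make the point concrete, here is a minimal sketch (in Python) of the two examples in the text, showing how an identical 'risk up 100 per cent' headline can sit on top of wildly different baselines:

```python
# Two risks that both "double" (a 100 per cent relative increase),
# with very different absolute meanings.
def doubled(baseline):
    """Return the absolute risk after a 100 per cent relative increase."""
    return baseline * 2

rare = 1 / 1_000_000        # one in a million
roulette = 1 / 6            # one bullet in a six-chamber revolver

print(doubled(rare))        # two in a million: still negligible
print(doubled(roulette))    # one in three: very much not negligible
```

  The headline 'risk up 100 per cent' describes both lines equally well, which is exactly why, on its own, it tells you almost nothing.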

  This – you might think conspicuous – oversight is strangely (and in our view scandalously) typical. 'Risk of X doubles for pregnant women.' 'Drinking raises risk of Y.' 'Cell-phone cancer risk up 50 per cent.' You'll be all too familiar with this type of headline, above reports that more often than not ignore the baseline risk.

  Let's do it properly. What is the baseline risk of colorectal cancer? There are two ways of describing it, an obscure one and an easy one. Skip this paragraph if you don't fancy obscurity. First, the way the World Cancer Research Fund does it. The incidence of colorectal cancer in the United Kingdom at the moment is about 45 per 100,000 for men and about 40 per 100,000 for women. That is the baseline. A hundred pages later, we find the 21 per cent increase owing to bacon. None of this is intuitively easy to understand or conveniently presented. Media coverage, by and large, was even worse, usually ignoring the baseline risk altogether.

  Fortunately, there is another way. Those who neglect it, whether media, cancer charities, or anyone else, ought to have a good explanation, though we have yet to hear one. And we need endure neither the 'eat bacon and die' flavour of advice from some quarters, bland reassurance from others, nor the mire of percentage increases on rates per 100,000 from others.

  Here it is:

  About five men in a hundred typically get colorectal cancer in a lifetime. If they all ate an extra couple of slices of bacon every day, about six would.

  And that's it.

  All the information so mangled or ignored is there in two short sentences which, by counting people instead of abstract, relative percentages, are intuitively easy to grasp. We can see for ourselves that for 99 men in 100, an extra bit of bacon a day makes no difference to whether they get colorectal cancer, and we can decide with a clearer head whether to take the risk of being the one exception.
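  Those two sentences can be checked with one line of arithmetic. This sketch simply applies the text's 21 per cent relative increase to its round baseline of five men in a hundred:

```python
# Apply the 21 per cent relative increase to the baseline of 5 in 100.
baseline_per_100 = 5                            # lifetime cases, per 100 men
with_bacon_per_100 = baseline_per_100 * 1.21    # 21 per cent increase

print(round(with_bacon_per_100))                # prints 6
```

  The unrounded answer is 6.05 in 100: counting people, rather than quoting relative percentages, is what makes the change easy to grasp.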

  'Save our Bacon: Butty Battle!' said the Sun newspaper. But it beat the serious newspapers for intelligible reporting of the risks, being one of very few to make it clear how many people could be affected.

  It should be easy. And yet …

  'For every alcoholic drink a woman consumes, her risk of breast cancer rises by 6 per cent.'

  That's pure garbage, by the way, despite its prominence on the BBC's national TV news bulletins on 12 November 2002, and it's soon obvious why: if true, every woman who drank regularly – and plenty who liked an occasional tipple – would be a certainty for breast cancer in time for Christmas. A 6 per cent increase with every glass soon adds up; about seven bottles of wine over a lifetime would make you a sure thing.
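  You can check the 'seven bottles' arithmetic yourself. This sketch compounds the claimed 6 per cent per glass until the supposed risk passes certainty; the 9-in-100 lifetime baseline and six glasses per bottle are illustrative round assumptions, not figures from the broadcast:

```python
# If "6 per cent per drink" compounded, how many glasses until the
# claimed risk exceeds 100 per cent? Baseline risk and glasses per
# bottle are illustrative assumptions.
baseline = 0.09        # assumed lifetime risk: about 9 women in 100
per_glass = 1.06       # the claimed 6 per cent increase, per drink

glasses = 0
risk = baseline
while risk < 1.0:      # stop once the "risk" passes certainty
    risk *= per_glass
    glasses += 1

print(glasses, glasses / 6)   # prints 42 7.0 -- about seven bottles
```

  On those assumptions, certainty arrives after roughly forty glasses: an absurd conclusion, which is the giveaway that the 6 per cent figure cannot mean what the report implied.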

 
