Tim Harford The Undercover Economist

Other Writing

Articles from the New York Times, Forbes, Wired and beyond – any piece that isn’t one of my columns.

Other Writing

Why pilot schemes help ideas take flight

There’s huge value in experiments that help us decide whether to go big or go home

Here’s a little puzzle. You’re offered the chance to participate in two high-risk business ventures. Each costs £11,000. Each will be worth £1m if all goes well. Each has just a 1 per cent chance of success. The mystery is that the ventures have very different expected pay-offs.

One of these opportunities is a poor investment: it costs £11,000 to get an expected payout of £10,000, which is 1 per cent of a million. Unless you take enormous pleasure in gambling, the venture makes no sense.

Strangely, the other opportunity, while still risky, is an excellent bet. With the same cost and the same chance of success, how could that be?

Here’s the subtle difference. This attractive alternative project has two stages. The first is a pilot, costing £1,000. The pilot has a 90 per cent chance of failing, which would end the whole project. If the pilot succeeds, scaling up will cost a further £10,000, and there will be a 10 per cent chance of a million-pound payday.

This two-stage structure changes everything. While the total cost is still £11,000 and the chance of success is still 1 per cent, the option to get out after a failed pilot is invaluable. Nine times out of 10, the pilot will save you from wasting £10,000 – which means that while the simple project offers an expected loss of £1,000, the two-stage project has an expected profit of £8,000.
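For readers who want to check the sums, here is a minimal sketch of the expected-value arithmetic in Python. The figures are the illustrative ones above, not data from any real venture.

```python
# Illustrative figures from the example above -- not real project data.
PRIZE = 1_000_000

# One-stage venture: pay 11,000 up front for a 1 per cent shot at the prize.
one_stage = 0.01 * PRIZE - 11_000          # expected profit: -1,000

# Two-stage venture: a 1,000 pilot succeeds 10 per cent of the time;
# only then is the further 10,000 spent, with a 10 per cent chance of the prize.
pilot_cost, scale_cost = 1_000, 10_000
p_pilot, p_payoff = 0.10, 0.10
two_stage = -pilot_cost + p_pilot * (-scale_cost + p_payoff * PRIZE)

print(one_stage)   # -1000.0
print(two_stage)   # 8000.0 -- same total cost, same 1 per cent odds, far better bet
```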

In a real project, nobody could ever be sure about the probability of success or its rewards. But the idea behind this example is very real: there’s huge value in experiments that help us decide whether to go big or go home.

We can see this effect in data from the venture capital industry. One study looked at companies backed by US venture capitalists (VCs) between 1986 and 1997, comparing them with a sample of companies chosen randomly to be the same age, size and from the same industry. (These results were published in this summer’s Journal of Economic Perspectives in an article titled “Entrepreneurship as Experimentation”.)

By 2007, only a quarter of the VC-backed firms had survived, while one-third of the comparison group was still in business. However, the surviving VC-backed firms were big successes, employing more than five times as many people as the surviving comparison firms. We can’t tell from this data whether the VCs are creating winners or merely spotting them in advance but we can see that big successes on an aggregate scale are entwined with a very high failure rate.

The option to conduct a cheap test run can be very valuable. It’s easy to lose sight of quite how valuable. Aza Raskin, who was lead designer for the Firefox browser, cites the late Paul MacCready as his inspiration on this point. MacCready was one of the great aeronautical engineers, and his most famous achievement was to build the Gossamer Condor and the Gossamer Albatross, human-powered planes that tore up the record books in the late 1970s.

One of MacCready’s key ideas was to develop a plane that could swiftly be rebuilt after a crash. Each test flight revealed fresh information, MacCready figured, but human-powered planes are so feather-light that each test flight also damages the plane. The most important thing a designer could do was to build a plane that could be rebuilt within days or even hours after a crash – rather than weeks or months. Once the problem of fast, cheap experimentation was solved, everything else followed.

Some professions have internalised this lesson. Architects use scale models to shed light on how a completed building might look and feel. A nicely made model can take days of work to complete but that is not much compared with the cost of the building itself.

Politicians don’t find it so easy. A new policy is hardly a new policy at all unless it can be unveiled in a blaze of glory, preferably as a well-timed surprise. That hardly suits the MacCready approach. Imagine the conference speech: “We’re announcing a new array of quick-and-dirty experiments with the welfare state. We’ll be iterating rapidly after each new blunder and heart-rending tabloid anecdote.”

A subtler problem is that projects need a certain scale before powerful decision makers will take them seriously.

“The transaction costs involved in setting up any aid project are so great that most donors don’t want to consider a project spending less than £20m,” says Owen Barder, director for Europe at the Center for Global Development, a think-tank. I suspect that the same insight applies far beyond the aid industry. Governments and large corporations can find it’s such a hassle to get anything up and running that the big stakeholders don’t want to be bothered with anything small.

That is a shame. The real leverage of a pilot scheme is that although it is cheap, it could have much larger consequences. The experiment itself may seem too small to bother with; the lesson it teaches is not.

Also published at ft.com.

21st of October, 2014
Highlights

How to see into the future

Billions of dollars are spent on experts who claim they can forecast what’s around the corner, in business, finance and economics. Most of them get it wrong. Now a groundbreaking study has unlocked the secret: it IS possible to predict the future – and a new breed of ‘superforecasters’ knows how to do it

Irving Fisher was once the most famous economist in the world. Some would say he was the greatest economist who ever lived. “Anywhere from a decade to two generations ahead of his time,” opined the first Nobel laureate economist Ragnar Frisch, in the late 1940s, more than half a century after Fisher’s genius first lit up his subject. But while Fisher’s approach to economics is firmly embedded in the modern discipline, many of those who remember him now know just one thing about him: that two weeks before the great Wall Street crash of 1929, Fisher announced, “Stocks have reached what looks like a permanently high plateau.”

In the 1920s, Fisher had two great rivals. One was a British academic: John Maynard Keynes, a rising star and Fisher’s equal as an economic theorist and policy adviser. The other was a commercial competitor, an American like Fisher. Roger Babson was a serial entrepreneur with no serious academic credentials, inspired to sell economic forecasts by the banking crisis of 1907. As Babson and Fisher locked horns over the following quarter-century, they laid the foundations of the modern economic forecasting industry.

Fisher’s rivals fared better than he did. Babson foretold the crash and made a fortune, enough to endow the well-respected Babson College. Keynes was caught out by the crisis but recovered and became rich anyway. Fisher died in poverty, ruined by the failure of his forecasts.

If Fisher and Babson could see the modern forecasting industry, it would astonish them with its scale, range and hyperactivity. In his acerbic book The Fortune Sellers, former consultant William Sherden reckoned in 1998 that forecasting was a $200bn industry – $300bn in today’s terms – and the bulk of the money was being made in business, economic and financial forecasting.

It is true that forecasting now seems ubiquitous. Data analysts forecast demand for new products, or the impact of a discount or special offer; scenario planners (I used to be one) produce broad-based narratives with the aim of provoking fresh thinking; nowcasters look at Twitter or Google to track epidemics, actual or metaphorical, in real time; intelligence agencies look for clues about where the next geopolitical crisis will emerge; and banks, finance ministries, consultants and international agencies release regular prophecies covering dozens, even hundreds, of macroeconomic variables.

Real breakthroughs have been achieved in certain areas, especially where rich datasets have become available – for example, weather forecasting, online retailing and supply-chain management. Yet when it comes to the headline-grabbing business of geopolitical or macroeconomic forecasting, it is not clear that we are any better at the fundamental task that the industry claims to fulfil – seeing into the future.

So why is forecasting so difficult – and is there hope for improvement? And why did Babson and Keynes prosper while Fisher suffered? What did they understand that Fisher, for all his prodigious talents, did not?

In 1987, a young Canadian-born psychologist, Philip Tetlock, planted a time bomb under the forecasting industry that would not explode for 18 years. Tetlock had been trying to figure out what, if anything, the social sciences could contribute to the fundamental problem of the day, which was preventing a nuclear apocalypse. He soon found himself frustrated: frustrated by the fact that the leading political scientists, Sovietologists, historians and policy wonks took such contradictory positions about the state of the cold war; frustrated by their refusal to change their minds in the face of contradictory evidence; and frustrated by the many ways in which even failed forecasts could be justified. “I was nearly right but fortunately it was Gorbachev rather than some neo-Stalinist who took over the reins.” “I made the right mistake: far more dangerous to underestimate the Soviet threat than overestimate it.” Or, of course, the get-out for all failed stock market forecasts, “Only my timing was wrong.”

Tetlock’s response was patient, painstaking and quietly brilliant. He began to collect forecasts from almost 300 experts, eventually accumulating 27,500. The main focus was on politics and geopolitics, with a selection of questions from other areas such as economics thrown in. Tetlock sought clearly defined questions, enabling him with the benefit of hindsight to pronounce each forecast right or wrong. Then Tetlock simply waited while the results rolled in – for 18 years.

Tetlock published his conclusions in 2005, in a subtle and scholarly book, Expert Political Judgment. He found that his experts were terrible forecasters. This was true both in the simple sense that the forecasts failed to materialise and in the deeper sense that the experts had little idea of how confident they should be in making forecasts in different contexts. It is easier to make forecasts about the territorial integrity of Canada than about the territorial integrity of Syria but, beyond the most obvious cases, the experts Tetlock consulted failed to distinguish the Canadas from the Syrias.

Adding to the appeal of this tale of expert hubris, Tetlock found that the most famous experts fared somewhat worse than those outside the media spotlight. Other than that, the humiliation was evenly distributed. Regardless of political ideology, profession and academic training, experts failed to see into the future.

Most people, hearing about Tetlock’s research, simply conclude that either the world is too complex to forecast, or that experts are too stupid to forecast it, or both. Tetlock himself refused to embrace cynicism so easily. He wanted to leave open the possibility that even for these intractable human questions of macroeconomics and geopolitics, a forecasting approach might exist that would bear fruit.

. . .

In 2013, on the auspicious date of April 1, I received an email from Tetlock inviting me to join what he described as “a major new research programme funded in part by Intelligence Advanced Research Projects Activity, an agency within the US intelligence community.”

The core of the programme, which had been running since 2011, was a collection of quantifiable forecasts much like Tetlock’s long-running study. The forecasts would be of economic and geopolitical events, “real and pressing matters of the sort that concern the intelligence community – whether Greece will default, whether there will be a military strike on Iran, etc”. These forecasts took the form of a tournament with thousands of contestants; it is now at the start of its fourth and final annual season.

“You would simply log on to a website,” Tetlock’s email continued, “give your best judgment about matters you may be following anyway, and update that judgment if and when you feel it should be. When time passes and forecasts are judged, you could compare your results with those of others.”

I elected not to participate but 20,000 others have embraced the idea. Some could reasonably be described as having some professional standing, with experience in intelligence analysis, think-tanks or academia. Others are pure amateurs. Tetlock and two other psychologists, Don Moore and Barbara Mellers, have been running experiments with the co-operation of this army of volunteers. (Mellers and Tetlock are married.) Some were given training in how to turn knowledge about the world into a probabilistic forecast; some were assembled into teams; some were given information about other forecasts while others operated in isolation. The entire exercise was given the name of the Good Judgment Project, and the aim was to find better ways to see into the future.

The early years of the forecasting tournament have, wrote Tetlock, “already yielded exciting results”.

A first insight is that even brief training works: a 20-minute course about how to put a probability on a forecast, correcting for well-known biases, provides lasting improvements to performance. This might seem extraordinary – and the benefits were surprisingly large – but even experienced geopolitical seers tend to have expertise in a subject, such as Europe’s economies or Chinese foreign policy, rather than training in the task of forecasting itself.

“For people with the right talents or the right tactics, it is possible to see into the future after all”

A second insight is that teamwork helps. When the project assembled the most successful forecasters into teams who were able to discuss and argue, they produced better predictions.

But ultimately one might expect the same basic finding as always: that forecasting events is basically impossible. Wrong. To connoisseurs of the frailties of futurology, the results of the Good Judgment Project are quite astonishing. Forecasting is possible, and some people – call them “superforecasters” – can predict geopolitical events with an accuracy far outstripping chance. The superforecasters have been able to sustain and even improve their performance.

The cynics were too hasty: for people with the right talents or the right tactics, it is possible to see into the future after all.

Roger Babson, Irving Fisher’s competitor, would always have claimed as much. A serial entrepreneur, Babson made his fortune selling economic forecasts alongside information about business conditions. In 1920, the Babson Statistical Organization had 12,000 subscribers and revenue of $1.35m – almost $16m in today’s money.

“After Babson, the forecaster was an instantly recognisable figure in American business,” writes Walter Friedman, the author of Fortune Tellers, a history of Babson, Fisher and other early economic forecasters. Babson certainly understood how to sell himself and his services. He advertised heavily and wrote prolifically. He gave a complimentary subscription to Thomas Edison, hoping for a celebrity endorsement. After contracting tuberculosis, Babson turned his management of the disease into an inspirational business story. He even employed stonecutters to carve inspirational slogans into large rocks in Massachusetts (the “Babson Boulders” are still there).

On September 5 1929, Babson made a speech at a business conference in Wellesley, Massachusetts. He predicted trouble: “Sooner or later a crash is coming which will take in the leading stocks and cause a decline of from 60 to 80 points in the Dow-Jones barometer.” This would have been a fall of around 20 per cent.

So famous had Babson become that his warning was briefly a self-fulfilling prophecy. When the news tickers of New York reported Babson’s comments at around 2pm, the markets erupted into what The New York Times described as “a storm of selling”. Shares lurched down by 3 per cent. This became known as the “Babson break”.

The next day, shares bounced back and Babson, for a few weeks, appeared ridiculous. On October 29, the great crash began, and within a fortnight the market had fallen almost 50 per cent. By then, Babson had an advertisement in the New York Times pointing out, reasonably, that “Babson clients were prepared”. Subway cars were decorated with the slogan, “Be Right with Babson”. For Babson, his forecasting triumph was a great opportunity to sell more subscriptions.

But his true skill was marketing, not forecasting. His key product, the “Babson chart”, looked scientific and was inspired by the discoveries of Isaac Newton, his idol. The Babson chart operated on the Newtonian assumption that any economic expansion would be matched by an equal and opposite contraction. But for all its apparent sophistication, the Babson chart offered a simple and usually contrarian message.

“Babson offered an up-arrow or a down-arrow. People loved that,” says Walter Friedman. Whether or not Babson’s forecasts were accurate was not a matter that seemed to concern many people. When he was right, he advertised the fact heavily. When he was wrong, few noticed. And Babson had indeed been wrong for many years during the long boom of the 1920s. People taking his advice would have missed out on lucrative opportunities to invest. That simply didn’t matter: his services were popular, and his most spectacularly successful prophecy was also his most famous.

Babson’s triumph suggests an important lesson: commercial success as a forecaster has little to do with whether you are any good at seeing into the future. No doubt it helped his case when his forecasts were correct but nobody gathered systematic information about how accurate he was. The Babson Statistical Organization compiled business and economic indicators that were, in all probability, of substantial value in their own right. Babson’s prognostications were the peacock’s plumage; their effect was simply to attract attention to the services his company provided.

. . .

When Barbara Mellers, Don Moore and Philip Tetlock established the Good Judgment Project, the basic principle was to collect specific predictions about the future and then check to see if they came true. That is not the world Roger Babson inhabited and neither does it describe the task of modern pundits.

When we talk about the future, we often aren’t talking about the future at all but about the problems of today. A newspaper columnist who offers a view on the future of North Korea, or the European Union, is trying to catch the eye, support an argument, or convey in a couple of sentences a worldview that would otherwise be impossibly unwieldy to explain. A talking head in a TV studio offers predictions by way of making conversation. A government analyst or corporate planner may be trying to justify earlier decisions, engaging in bureaucratic defensiveness. And many election forecasts are simple acts of cheerleading for one side or the other.

“Some people – call them ‘superforecasters’ – can predict geopolitical events with an accuracy far outstripping chance”

Unlike the predictions collected by the Good Judgment Project, many forecasts are vague enough in their details to let the mistaken seer off the hook. Even if it were possible to pronounce whether a forecast had come true or not, only in a few hotly disputed cases would anybody bother to check.

All this suggests that among the various strategies employed by the superforecasters of the Good Judgment Project, the most basic explanation of their success is that they have the single uncompromised objective of seeing into the future – and this is rare. They receive continual feedback about the success and failure of every forecast, and there are no points for radicalism, originality, boldness, conventional pieties, contrarianism or wit. The project manager of the Good Judgment Project, Terry Murray, says simply, “The only thing that matters is the right answer.”

I asked Murray for her tips on how to be a good forecaster. Her reply was, “Keep score.”

. . .

An intriguing footnote to Philip Tetlock’s original humbling of the experts was that the forecasters who did best were what Tetlock calls “foxes” rather than “hedgehogs”. He used the term to refer to a particular style of thinking: broad rather than deep, intuitive rather than logical, self-critical rather than assured, and ad hoc rather than systematic. The “foxy” thinking style is now much in vogue. Nate Silver, the data journalist most famous for his successful forecasts of US elections, adopted the fox as the mascot of his website as a symbol of “a pluralistic approach”.

The trouble is that Tetlock’s original foxes weren’t actually very good at forecasting. They were merely less awful than the hedgehogs, who deployed a methodical, logical train of thought that proved useless for predicting world affairs. That world, apparently, is too complex for any single logical framework to encompass.

More recent research by the Good Judgment Project investigators leaves foxes and hedgehogs behind but develops this idea that personality matters. Barbara Mellers told me that the thinking style most associated with making better forecasts was something psychologists call “actively open-minded thinking”. A questionnaire to diagnose this trait invites people to rate their agreement or disagreement with statements such as, “Changing your mind is a sign of weakness.” The project found that successful forecasters aren’t afraid to change their minds, are happy to seek out conflicting views and are comfortable with the notion that fresh evidence might force them to abandon an old view of the world and embrace something new.

Which brings us to the strange, sad story of Irving Fisher and John Maynard Keynes. The two men had much in common: both giants in the field of economics; both best-selling authors; both, alas, enthusiastic and prominent eugenicists. Both had immense charisma as public speakers.

Fisher and Keynes also shared a fascination with financial markets, and a conviction that their expertise in macroeconomics and in economic statistics should lead to success as an investor. Both of them, ultimately, were wrong about this. The stock market crashes of 1929 – in September in the UK and late October in the US – caught each of them by surprise, and both lost heavily.

Yet Keynes is remembered today as a successful investor. This is not unreasonable. A study by David Chambers and Elroy Dimson, two financial economists, concluded that Keynes’s track record over a quarter century running the discretionary portfolio of King’s College Cambridge was excellent, outperforming market benchmarks by an average of six percentage points a year, an impressive margin.

This wasn’t because Keynes was a great economic forecaster. His original approach had been predicated on timing the business cycle, moving into and out of different investment classes depending on which way the economy itself was moving. This investment strategy was not a success, and after several years Keynes’s portfolio was almost 20 per cent behind the market as a whole.

The secret to Keynes’s eventual profits is that he changed his approach. He abandoned macroeconomic forecasting entirely. Instead, he sought out well-managed companies with strong dividend yields, and held on to them for the long term. This approach is now associated with Warren Buffett, who quotes Keynes’s investment maxims with approval. But the key insight is that the strategy does not require macroeconomic predictions. Keynes, the most influential macroeconomist in history, realised not only that such forecasts were beyond his skill but that they were unnecessary.

Irving Fisher’s mistake was not that his forecasts were any worse than Keynes’s but that he depended on them to be right, and they weren’t. Fisher’s investments were leveraged by the use of borrowed money. This magnified his gains during the boom, his confidence, and then his losses in the crash.

But there is more to Fisher’s undoing than leverage. His pre-crash gains were large enough that he could easily have cut his losses and lived comfortably. Instead, he was convinced the market would turn again. He made several comments about how the crash was “largely psychological”, or “panic”, and how recovery was imminent. It was not.

One of Fisher’s major investments was in Remington Rand – he was on the stationery company’s board after selling them his “Index Visible” invention, a type of Rolodex. The share price tells the story: $58 before the crash, $28 by 1930. Fisher topped up his investments – and the price soon dropped to $1.

Fisher became deeper and deeper in debt to the taxman and to his brokers. Towards the end of his life, he was a marginalised figure living alone in modest circumstances, an easy target for scam artists. Sylvia Nasar writes in Grand Pursuit, a history of economic thought, “His optimism, overconfidence and stubbornness betrayed him.”

. . .

So what is the secret of looking into the future? Initial results from the Good Judgment Project suggest the following approaches. First, some basic training in probabilistic reasoning helps to produce better forecasts. Second, teams of good forecasters produce better results than good forecasters working alone. Third, actively open-minded people prosper as forecasters.

But the Good Judgment Project also hints at why so many experts are such terrible forecasters. It’s not so much that they lack training, teamwork and open-mindedness – although some of these qualities are in shorter supply than others. It’s that most forecasters aren’t actually seriously and single-mindedly trying to see into the future. If they were, they’d keep score and try to improve their predictions based on past errors. They don’t.

“Successful forecasters aren’t afraid to change their minds and are comfortable with the notion that fresh evidence might mean abandoning an old view”

This is because our predictions are about the future only in the most superficial way. They are really advertisements, conversation pieces, declarations of tribal loyalty – or, as with Irving Fisher, statements of profound conviction about the logical structure of the world. As Roger Babson explained, not without sympathy, Fisher had failed because “he thinks the world is ruled by figures instead of feelings, or by theories instead of styles”.

Poor Fisher was trapped by his own logic, his unrelenting optimism and his repeated public declarations that stocks would recover. And he was bankrupted by an investment strategy in which he could not afford to be wrong.

Babson was perhaps wrong as often as he was right – nobody was keeping track closely enough to be sure either way – but that did not stop him making a fortune. And Keynes prospered when he moved to an investment strategy in which forecasts simply did not matter much.

Fisher once declared that “the sagacious businessman is constantly forecasting”. But Keynes famously wrote of long-term forecasts, “About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know.”

Perhaps even more famous is a remark often attributed to Keynes. “When my information changes, I alter my conclusions. What do you do, sir?”

If only he had taught that lesson to Irving Fisher.

Also published at ft.com.

Other Writing

Why inflation remains best way to avoid stagnation

The prospect is that central banks will find themselves helpless, writes Tim Harford

People who were not born when the financial crisis began are now old enough to read about it. We have been able to distract ourselves with two Olympics, two World Cups and two US presidential elections. Yet no matter how stale our economic troubles feel, they manage to linger.

Given the severity of the crisis and the inadequacy of the policy response, it should be no surprise that recovery has been slow and anaemic: that is what economic history always suggested. Yet some economists are growing disheartened. The talk is of “secular stagnation” – a phrase which could mean two things, neither of them good.

One fear has been well-aired: that future growth possibilities will be limited by an ageing population or perhaps even technological stagnation.

The second meaning of secular stagnation is altogether stranger: it is that regardless of their potential for growth, modern economies may suffer from a persistent tendency to slip below that potential, sliding into stubborn recessions. The west’s lost decade of economic growth may be a taste of things to come.

This view was put forward most forcefully by Lawrence Summers, who was Treasury secretary under Bill Clinton and a senior adviser to President Barack Obama. It has been discussed at length in a collection of essays published last week by the Centre for Economic Policy Research. But what could it mean?

Normally, when an economy slips into recession, the standard response is to cut interest rates. This encourages us to spend, rather than save, giving the economy an immediate boost.

Things become more difficult if nominal interest rates are already low. Central banks have to employ radical tactics of uncertain effectiveness, such as quantitative easing. Governments could and should borrow and spend to support the economy. In practice they have proved politically gridlocked (in the US), institutionally hamstrung (in the EU) or ideologically blinkered (in the UK). There is not much reason to think the politics of fiscal stimulus would be very different in the future, so the zero-interest rate boundary is a problem.

The awful prospect of secular stagnation is that this is the new normal. Interest rates will be very low as a matter of course, and central banks will routinely find themselves nearly helpless.

“A cut in interest rates encourages us to spend, rather than save, giving the economy an immediate boost”

Before we startle ourselves at shadows, let us ask why Prof Summers might be right. Real interest rates – the rates paid after adjusting for inflation – have been falling. In the US, real rates averaged about 5 per cent in the 1980s, 2 per cent in the 1990s and 1 per cent in the Noughties. (Since Lehman Brothers failed they have been negative, but the long-term trend speaks more eloquently.) Real interest rates have also been declining in the EU for 20 years. The International Monetary Fund’s estimate of global real interest rates has been declining for 30 years.

This does not look good, so why is it happening? The background level of real interest rates is set not by central banks but by supply and demand. Low real rates suggest lots of people are trying to save, and particularly in safe assets, while few people are trying to borrow and invest. Only with rates at a very low level can enough borrowers be found to mop up all the savings.

If secular stagnation is a real risk, we need policies to address it. One approach is to try to change the forces of supply and demand to boost the demand for cash to invest, while stemming the supply of savings, and reducing the bias towards super-safe assets.

This looks tricky. Much policy has pushed in the opposite direction. Consider the austerity drive and long-term goals to reduce government debt burdens; this reduces the supply of safe assets and pushes down real rates. Or the tendency in the UK to push pension risk away from companies and the government, and towards individuals; this encourages extra saving, just in case. Or the way in which (understandably) regulators insist that banks and pension funds hold more safe assets; again, this increases the demand for safe assets and pushes down real interest rates. To reverse all these policies, sacrificing microeconomic particulars for a rather abstract macroeconomic hunch, looks like a hard sell.

There is a simple alternative, albeit one that carries risks. Central bank targets for inflation should be raised to 4 per cent. A credible higher inflation target would provide immediate stimulus (who wants to squirrel away money that is eroding at 4 per cent a year?) and would give central banks more leeway to cut real rates in future. If equilibrium real interest rates are zero, that might not matter when central banks can produce real rates of minus 4 per cent.
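To spell out the arithmetic: the real rate r is roughly the nominal rate i minus inflation π, so a credible 4 per cent target gives central banks room to push real rates well below zero even when nominal rates can fall no further.

```latex
r \;\approx\; i - \pi
\qquad\Longrightarrow\qquad
r \;\approx\; 0 - 4\% \;=\; -4\%
\quad\text{(nominal rate at the zero bound, inflation at the 4 per cent target)}
```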

If all that makes you feel queasy, it should. As Prof Summers argues, unpleasant things have a tendency to happen when real interest rates are very low. Bubbles inflate, Ponzi schemes prosper and investors are reckless in their scrabble for yield.

One thing that need not worry anyone, though, is the prospect of an inflation target of 4 per cent. It will not happen. That is particularly true in the place where the world economy most needs more inflation: in the eurozone. The German folk memory of hyperinflation in 1923 is just too strong. That economic catastrophe, which helped lay the foundations for Nazism and ruin much of the 20th century, continues to resonate today.

What practical policy options remain? That is easy to see. We must cross our fingers and hope that Prof Summers is mistaken.

Also published at ft.com.

5th of September, 2014
Other Writing

Monopoly is a bureaucrat’s friend but a democrat’s foe

The challenges from smaller competitors spur the innovations that matter

“It takes a heap of Harberger triangles to fill an Okun gap,” wrote James Tobin in 1977, four years before winning the Nobel Prize in economics. He meant that the big issue in economics was not battling against monopolists but preventing recessions and promoting recovery.

After the misery of recent years, nobody can doubt that preventing recessions and promoting recovery would have been a very good idea. But economists should be able to think about more than one thing at once. What if monopoly matters, too?

The Harberger triangle is the loss to society as monopolists raise their prices, and it is named after Arnold Harberger, who 60 years ago discovered that the costs of monopoly were about 0.1 per cent of US gross domestic product – a few billion dollars these days, much less than expected and much less than a recession.
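That triangle is just the textbook deadweight-loss approximation: roughly half the monopoly price rise multiplied by the fall in quantity it causes, which is why modest mark-ups and modest demand responses add up to a small number relative to national income.

```latex
\text{deadweight loss} \;\approx\; \tfrac{1}{2}\,\Delta p \,\Delta q
```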

Professor Harberger’s discovery helped build a consensus that competition authorities could relax about the power of big business. But have we relaxed too much?

Large companies are all around us. We buy our mid-morning coffee from global brands such as Starbucks, use petrol from Exxon or Shell, listen to music purchased from a conglomerate such as Sony (via Apple’s iTunes), boot up a computer that runs Microsoft on an Intel processor. Crucial utilities – water, power, heating, internet and telephone – are supplied by a few dominant groups, with baffling contracts damping any competition.

Of course, not all large businesses have monopoly power. Tesco, the monarch of British food retailing, has found discount competitors chopping up its throne to use as kindling. Apple and Google are supplanting Microsoft. And even where market power is real, Prof Harberger’s point was that it may matter less than we think. But his influential analysis focused on monopoly pricing. We now know there are many other ways in which dominant businesses can harm us.

In 1989 the Beer Orders shook up a British pub industry controlled by six brewers. The hope was that more competition would lead to more and cheaper beer. It did not. The price of beer rose. Yet so did the quality of pubs. Where once every pub had offered rubbery sandwiches and stinking urinals, suddenly there were sports bars, candlelit gastropubs and other options. There is more to competition than lower prices.

Monopolists can sometimes use their scale and cash flow to produce real innovations – the glory years of Bell Labs come to mind. But the ferocious cut and thrust of smaller competitors seems a more reliable way to produce many of the everyday innovations that matter.

That cut and thrust is no longer so cutting or thrusting as once it was. “The business sector of the US economy is ageing,” says a Brookings research paper. It is a trend found across regions and industries, as incumbent players enjoy entrenched advantages. “The rate of business start-ups and the pace of employment dynamism in the US economy has fallen over recent decades . . . This downward trend accelerated after 2000,” adds a survey in the Journal of Economic Perspectives.

That means higher prices and less innovation, but perhaps the game is broader still. The continuing debate in the US over “net neutrality” is really an argument about the least damaging way to regulate the conduct of cable companies that hold local monopolies. If customers had real choice over their internet service provider, net neutrality rules would be needed only as a backstop.

As the debate reminds us, large companies enjoy power as lobbyists. When they are monopolists, the incentive to lobby increases because the gains from convenient new rules and laws accrue solely to them. Monopolies are no friend of a healthy democracy.

They are, alas, often the friend of government bureaucracies. This is not just a case of corruption but also about what is convenient and comprehensible to a politician or civil servant. If they want something done about climate change, they have a chat with the oil companies. Obesity is a problem to be discussed with the likes of McDonald’s. If anything on the internet makes a politician feel sad, from alleged copyright infringement to “the right to be forgotten”, there is now a one-stop shop to sort it all out: Google.

Politicians feel this is a sensible, almost convivial, way to do business – but neither the problems in question nor the goal of vigorous competition are resolved as a result.

One has only to consider the way the financial crisis has played out. The emergency response involved propping up big institutions and ramming through mergers; hardly a long-term solution to the problem of “too big to fail”. Even if smaller banks do not guarantee a more stable financial system, entrepreneurs and consumers would profit from more pluralistic competition for their business.

No policy can guarantee innovation, financial stability, sharper focus on social problems, healthier democracies, higher quality and lower prices. But assertive competition policy would improve our odds, whether through helping consumers to make empowered choices, splitting up large corporations or blocking megamergers. Such structural approaches are more effective than looking over the shoulders of giant corporations and nagging them; they should be a trusted tool of government rather than a last resort.

As human freedoms go, the freedom to take your custom elsewhere is not a grand or noble one – but neither is it one that we should abandon without a fight.

Also published at ft.com.

16th of August, 2014
Other Writing

Pity the robot drivers snarled in a human moral maze

Robotic cars do not get tired, drunk or angry but there are bound to be hiccups, says Tim Harford

Last Wednesday Vince Cable, the UK business secretary, invited British cities to express their interest in being used as testing grounds for driverless cars. The hope is that the UK will gain an edge in this promising new industry. (German autonomous cars were being tested on German, French and Danish public roads 20 years ago, so the time is surely ripe for the UK to leap into a position of technological leadership.)

On Tuesday, a very different motoring story was in the news. Mark Slater, a lorry driver, was convicted of murdering Trevor Allen. He had lost his temper and deliberately driven a 17 tonne lorry over Mr Allen’s head. It is a striking juxtaposition.

The idea of cars that drive themselves is unsettling, but with drivers like Slater at large, the age of the driverless car cannot come quickly enough.

But the question of how safe robotic cars are, or might become, is rather different from the question of how the risks of a computer-guided car are perceived, and how they might be repackaged by regulators, insurers and the courts.

On the first question, it is highly likely that a computer will one day do a better, safer, more courteous job of driving than you can. It is too early to be certain of that, because serious accidents are rare. An early benchmark for Google’s famous driverless car programme was to complete 100,000 miles driving on public roads – but American drivers in general only kill someone every 100m miles.
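A quick calculation shows why that benchmark, impressive as it sounds, proves little about fatal-accident rates:

```latex
\frac{100{,}000 \text{ miles}}{100{,}000{,}000 \text{ miles per fatality}} \;=\; 0.001 \text{ expected fatalities}
```

Even a spotless 100,000-mile record is roughly what chance alone would deliver for an average human driver.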

Still, the safety record so far seems good, and computers have some obvious advantages. They do not get tired, drunk or angry. They are absurdly patient in the face of wobbly cyclists, learner drivers and road hogs.

But there are bound to be hiccups. While researching this article my Google browser froze up while trying to read a Google blog post hosted on a Google blogging platform. Two seconds later the problem had been solved, but at 60 miles per hour two seconds is more than 50 metres. One hopes that Google-driven cars will be more reliable when it comes to the more literal type of crash.
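For anyone checking the conversion:

```latex
60 \text{ mph} \;\approx\; 26.8 \text{ m/s}, \qquad 26.8 \text{ m/s} \times 2\,\text{s} \;\approx\; 54 \text{ m}
```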

Yet the exponential progress of cheaper, faster computers with deeper databases of experience will probably guarantee success eventually. In a simpler world, that would be the end of it.

Reality is knottier. When a car knocks over a pedestrian, who is to blame? Our answer depends not only on particular circumstances but on social norms. In the US in the 1920s, the booming car industry found itself under pressure as pedestrian deaths mounted. One response was to popularise the word “jaywalking” as a term of ridicule for bumpkins who had no idea how to cross a street. Social norms changed, laws followed, and soon enough the default assumption was that pedestrians had no business being in the road. If they were killed they had only themselves to blame.

We should prepare ourselves for a similar battle over robot drivers. Assume that driverless cars are provably safer. When a human driver collides with a robo-car, where will our knee-jerk sympathies lie? Will we blame the robot for not spotting the human idiosyncrasies? Or the person for being so arrogant as to think he could drive without an autopilot?

When such questions arrive in the courts, as they surely will, robotic cars have a serious handicap. When they err, the error can be tracked back to a deep-pocketed manufacturer. It is quite conceivable that Google, Mercedes or Volvo might produce a robo-car that could avoid 90 per cent of the accidents that would befall a human driver, and yet be bankrupted by the legal cases arising from the 10 per cent that remained. The sensible benchmark for robo-drivers would be “better than human”, but the courts may punish them for being less than perfect.

There are deep waters here. How much space is enough when overtaking a slow vehicle – and is it legitimate for the answer to change when running late? When a child chases a ball out into the road, is it better to swerve into the path of an oncoming car, or on to the pavement where the child’s parents are standing, or not to swerve at all?

These are hardly thought of as ethical questions, because human drivers answer them intuitively and in an instant. But a computer’s priorities must be guided by its programmers, who have plenty of time to weigh up the tough ethical choices.

In 1967 Philippa Foot, one of Oxford’s great moral philosophers, posed a thought experiment that she called the “trolley problem”. A runaway railway trolley is about to kill five people, but by flipping the points, you can redirect it down a line where it will instead kill one. Which is the right course of action? It is a rich seam for ethical discourse, with many interesting variants. But surely Foot did not imagine that the trolley problem would have to be answered one way or another and wired into the priorities of computer chauffeurs – or that lawyers would second-guess those priorities in court in the wake of an accident.

Then there is the question of who opts for a driverless car. Sir David Spiegelhalter, a risk expert at Cambridge university, points out that most drivers are extremely safe. Most accidents are caused by a few idiots, and it is precisely those idiots, Sir David speculates, who are least likely to cede control to a computer.

Perhaps driverless cars will be held back by a tangle of social, legal and regulatory stubbornness. Or perhaps human drivers will one day be banned, or prohibitively expensive to insure. It is anyone’s guess, because while driving is no longer the sole preserve of meatsacks such as you and me, the question of what we fear and why we fear it remains profoundly, quirkily human.

Also published at ft.com.

7th of August, 2014
Other Writing

Gary Becker, 1930-2014

Gary Becker passed away on Saturday. My obituary for the Financial Times is below.

Gary Becker, the man who led the movement to apply economic ideas to areas of life such as marriage, discrimination and crime, died on May 3 after a long illness. He was 83.
Becker was born in a coal-mining town in Pennsylvania, raised in Brooklyn and took a mathematics degree summa cum laude from Princeton, yet it was not until he arrived at the University of Chicago that he realised “I had to begin to learn again what economics is all about”.
He had considered taking up sociology, but found it “too difficult”. Yet he was to return to the questions of sociology again and again over the years, taking pleasure in wielding the rigorous yet reductive mathematical tools of economics. This approach was to win him the Nobel Memorial Prize in Economics in 1992, and make him one of the most influential and most cited economists of the 20th century.
His doctoral dissertation was on the economics of discrimination – how to measure it and what effects it might have. Becker showed that discrimination was costly for the bigot as well as the victim. This seemed strange material for an economist, and Becker attracted little attention for his ideas when he published a book on discrimination in 1957.
This didn’t seem to worry him. “My whole philosophy has been to be conventional in things such as dress and so on,” he told me in 2005. “But when it comes to ideas, I’ll be willing to stick my neck out: I can take criticism if I think I’m right.”
He received plenty of that criticism over the years for daring to develop economic theories of crime and punishment, of the demand for children, and of rational addicts who may quit in response to a credible threat to raise the price of cigarettes. His idea that individuals might think of their education as an investment, with a rate of return, caused outrage. Yet nobody now frets about the use of the phrase “human capital”, the title of one of Becker’s books.
That exemplifies the way that Becker’s approach has changed the way that economists think about what they do, often without explicitly recognising his influence. He was economically omnivorous: colleagues such as Lars Peter Hansen, a fellow Nobel laureate, would find Becker quizzing them and providing penetrating comments even on research that seemed far removed from Becker’s main interests.
“He will be remembered as a person who in a very creative way broadened the scope of economic analysis,” said Professor Hansen, “and as one of the very best economists of the 20th century.”
Becker’s life-long affection was for the subject he transformed. On weekend afternoons, he would often be found in his office, writing or answering questions from young academics six decades his junior. He continued to write a blog with the legal scholar Richard Posner until a few weeks before his death.
“He loved economics,” said Kevin Murphy, who taught a course alongside Becker for many years, “and he inspired so many economists.” Perhaps the most likely result of a class with Becker was not mastering a particular formal technique, but acquiring that distinctive economist’s outlook on the world.
That worldview was on display when, on the way to his Lunch with the FT, Gary Becker parked illegally. On cross-examination, he cheerfully told me that after weighing the risks and benefits, this was a rational crime.
“That sounds like Gary to me,” said Prof Murphy. “He decided to give you a practical lesson in economics.”
Becker was widowed in 1970, and remarried in 1980 to a Chicago history professor, Guity Nashat. She survives him, as does a daughter, Catherine Becker; a sister, Natalie Becker; a stepson and two grandsons.

You can read my lunch with Gary Becker, or read more about his ideas in The Logic of Life. It was clear, speaking to his colleagues, that he will be greatly missed.

5th of May, 2014
Highlights

Big Data: Are we making a big mistake?

Five years ago, a team of researchers from Google announced a remarkable achievement in one of the world’s top scientific journals, Nature. Without needing the results of a single medical check-up, they were nevertheless able to track the spread of influenza across the US. What’s more, they could do it more quickly than the Centers for Disease Control and Prevention (CDC). Google’s tracking had only a day’s delay, compared with the week or more it took for the CDC to assemble a picture based on reports from doctors’ surgeries. Google was faster because it was tracking the outbreak by finding a correlation between what people searched for online and whether they had flu symptoms.

Not only was “Google Flu Trends” quick, accurate and cheap, it was theory-free. Google’s engineers didn’t bother to develop a hypothesis about what search terms – “flu symptoms” or “pharmacies near me” – might be correlated with the spread of the disease itself. The Google team just took their top 50 million search terms and let the algorithms do the work.
The success of Google Flu Trends became emblematic of the hot new trend in business, technology and science: “Big Data”. What, excited journalists asked, can science learn from Google?
As with so many buzzwords, “big data” is a vague term, often thrown around by people with something to sell. Some emphasise the sheer scale of the data sets that now exist – the Large Hadron Collider’s computers, for example, store 15 petabytes a year of data, equivalent to about 15,000 years’ worth of your favourite music.
But the “big data” that interests many companies is what we might call “found data”, the digital exhaust of web searches, credit card payments and mobiles pinging the nearest phone mast. Google Flu Trends was built on found data and it’s this sort of data that interests me here. Such data sets can be even bigger than the LHC data – Facebook’s is – but just as noteworthy is the fact that they are cheap to collect relative to their size, they are a messy collage of datapoints collected for disparate purposes and they can be updated in real time. As our communication, leisure and commerce have moved to the internet and the internet has moved into our phones, our cars and even our glasses, life can be recorded and quantified in a way that would have been hard to imagine just a decade ago.
Cheerleaders for big data have made four exciting claims, each one reflected in the success of Google Flu Trends: that data analysis produces uncannily accurate results; that every single data point can be captured, making old statistical sampling techniques obsolete; that it is passé to fret about what causes what, because statistical correlation tells us what we need to know; and that scientific or statistical models aren’t needed because, to quote “The End of Theory”, a provocative essay published in Wired in 2008, “with enough data, the numbers speak for themselves”.
Unfortunately, these four articles of faith are at best optimistic oversimplifications. At worst, according to David Spiegelhalter, Winton Professor of the Public Understanding of Risk at Cambridge university, they can be “complete bollocks. Absolute nonsense.”
Found data underpin the new internet economy as companies such as Google, Facebook and Amazon seek new ways to understand our lives through our data exhaust. Since Edward Snowden’s leaks about the scale and scope of US electronic surveillance it has become apparent that security services are just as fascinated with what they might learn from our data exhaust, too.
Consultants urge the data-naive to wise up to the potential of big data. A recent report from the McKinsey Global Institute reckoned that the US healthcare system could save $300bn a year – $1,000 per American – through better integration and analysis of the data produced by everything from clinical trials to health insurance transactions to smart running shoes.
But while big data promise much to scientists, entrepreneurs and governments, they are doomed to disappoint us if we ignore some very familiar statistical lessons.
“There are a lot of small data problems that occur in big data,” says Spiegelhalter. “They don’t disappear because you’ve got lots of the stuff. They get worse.”
. . .
Four years after the original Nature paper was published, Nature News had sad tidings to convey: the latest flu outbreak had claimed an unexpected victim: Google Flu Trends. After reliably providing a swift and accurate account of flu outbreaks for several winters, the theory-free, data-rich model had lost its nose for where flu was going. Google’s model pointed to a severe outbreak but when the slow-and-steady data from the CDC arrived, they showed that Google’s estimates of the spread of flu-like illnesses were overstated by almost a factor of two.
The problem was that Google did not know – could not begin to know – what linked the search terms with the spread of flu. Google’s engineers weren’t trying to figure out what caused what. They were merely finding statistical patterns in the data. They cared about correlation rather than causation. This is common in big data analysis. Figuring out what causes what is hard (impossible, some say). Figuring out what is correlated with what is much cheaper and easier. That is why, according to Viktor Mayer-Schönberger and Kenneth Cukier’s book, Big Data, “causality won’t be discarded, but it is being knocked off its pedestal as the primary fountain of meaning”.
But a theory-free analysis of mere correlations is inevitably fragile. If you have no idea what is behind a correlation, you have no idea what might cause that correlation to break down. One explanation of the Flu Trends failure is that the news was full of scary stories about flu in December 2012 and that these stories provoked internet searches by people who were healthy. Another possible explanation is that Google’s own search algorithm moved the goalposts when it began automatically suggesting diagnoses when people entered medical symptoms.
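A toy simulation makes the fragility concrete. This is not Google’s model; the numbers and the “media panic” term are invented purely to show how a correlation fitted in ordinary winters overshoots once the behaviour behind it shifts.

```python
# Toy illustration only -- invented numbers, not Google's method.
import random

random.seed(0)

def searches(flu_rate, media_panic=0.0):
    # Assumed link: search volume tracks true flu, plus noise, plus extra
    # searches made by healthy people reading scary headlines.
    return 10 * flu_rate + media_panic + random.gauss(0, 1)

# "Train" a naive, theory-free model on ordinary winters: predict flu from
# searches with a simple regression through the origin.
history = [(flu, searches(flu)) for flu in (random.uniform(1, 5) for _ in range(500))]
slope = sum(f * s for f, s in history) / sum(s * s for _, s in history)

true_flu = 3.0
print(round(slope * searches(true_flu), 2))                   # ~3: fine in a normal winter
print(round(slope * searches(true_flu, media_panic=30), 2))   # ~6: overshoots when headlines, not flu, drive searches
```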
Google Flu Trends will bounce back, recalibrated with fresh data – and rightly so. There are many reasons to be excited about the broader opportunities offered to us by the ease with which we can gather and analyse vast data sets. But unless we learn the lessons of this episode, we will find ourselves repeating it.
Statisticians have spent the past 200 years figuring out what traps lie in wait when we try to understand the world through data. The data are bigger, faster and cheaper these days – but we must not pretend that the traps have all been made safe. They have not.
. . .
In 1936, the Republican Alfred Landon stood for election against President Franklin Delano Roosevelt. The respected magazine, The Literary Digest, shouldered the responsibility of forecasting the result. It conducted a postal opinion poll of astonishing ambition, with the aim of reaching 10 million people, a quarter of the electorate. The deluge of mailed-in replies can hardly be imagined but the Digest seemed to be relishing the scale of the task. In late August it reported, “Next week, the first answers from these ten million will begin the incoming tide of marked ballots, to be triple-checked, verified, five-times cross-classified and totalled.”
After tabulating an astonishing 2.4 million returns as they flowed in over two months, The Literary Digest announced its conclusions: Landon would win by a convincing 55 per cent to 41 per cent, with a few voters favouring a third candidate.
The election delivered a very different result: Roosevelt crushed Landon by 61 per cent to 37 per cent. To add to The Literary Digest’s agony, a far smaller survey conducted by the opinion poll pioneer George Gallup came much closer to the final vote, forecasting a comfortable victory for Roosevelt. Mr Gallup understood something that The Literary Digest did not. When it comes to data, size isn’t everything.
Opinion polls are based on samples of the voting population at large. This means that opinion pollsters need to deal with two issues: sample error and sample bias.
Sample error reflects the risk that, purely by chance, a randomly chosen sample of opinions does not reflect the true views of the population. The “margin of error” reported in opinion polls reflects this risk and the larger the sample, the smaller the margin of error. A thousand interviews is a large enough sample for many purposes and Mr Gallup is reported to have conducted 3,000 interviews.
But if 3,000 interviews were good, why weren’t 2.4 million far better? The answer is that sampling error has a far more dangerous friend: sampling bias. Sampling error is when a randomly chosen sample doesn’t reflect the underlying population purely by chance; sampling bias is when the sample isn’t randomly chosen at all. George Gallup took pains to find an unbiased sample because he knew that was far more important than finding a big one.
The Literary Digest, in its quest for a bigger data set, fumbled the question of a biased sample. It mailed out forms to people on a list it had compiled from automobile registrations and telephone directories – a sample that, at least in 1936, was disproportionately prosperous. To compound the problem, Landon supporters turned out to be more likely to mail back their answers. The combination of those two biases was enough to doom The Literary Digest’s poll. For each person George Gallup’s pollsters interviewed, The Literary Digest received 800 responses. All that gave them for their pains was a very precise estimate of the wrong answer.
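To make the distinction concrete, here is a minimal simulation sketch in Python. The true Roosevelt share and the two sample sizes come from the story above; everything else – including the assumption that the Digest’s mailing list leaned towards Roosevelt at only 41 per cent – is invented purely for illustration.

    # A minimal sketch of sampling error versus sampling bias.
    # The true Roosevelt share (0.61) and the sample sizes come from the
    # article; the make-up of the biased sample is an illustrative assumption.
    import random

    random.seed(1936)
    TRUE_ROOSEVELT_SHARE = 0.61

    def poll(n, roosevelt_share):
        """Return the Roosevelt share in a simulated sample of n respondents."""
        return sum(random.random() < roosevelt_share for _ in range(n)) / n

    # Small but random sample: only sampling error, which shrinks as 1/sqrt(n).
    gallup = poll(3_000, TRUE_ROOSEVELT_SHARE)

    # Huge but biased sample: drawn from a subpopulation assumed to favour
    # Roosevelt at just 41 per cent, echoing the Digest's published figure.
    digest = poll(2_400_000, roosevelt_share=0.41)

    print(f"Random sample of 3,000: {gallup:.1%} for Roosevelt")
    print(f"Biased sample of 2.4m:  {digest:.1%} for Roosevelt")
    print(f"Actual result:          {TRUE_ROOSEVELT_SHARE:.0%} for Roosevelt")

The big sample’s margin of error is tiny – but it is a tiny margin around the wrong number.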
The big data craze threatens to be The Literary Digest all over again. Because found data sets are so messy, it can be hard to figure out what biases lurk inside them – and because they are so large, some analysts seem to have decided the sampling problem isn’t worth worrying about. It is.
Professor Viktor Mayer-Schönberger of Oxford’s Internet Institute, co-author of Big Data, told me that his favoured definition of a big data set is one where “N = All” – where we no longer have to sample, but we have the entire background population. Returning officers do not estimate an election result with a representative tally: they count the votes – all the votes. And when “N = All” there is indeed no issue of sampling bias because the sample includes everyone.
But is “N = All” really a good description of most of the found data sets we are considering? Probably not. “I would challenge the notion that one could ever have all the data,” says Patrick Wolfe, a computer scientist and professor of statistics at University College London.
An example is Twitter. It is in principle possible to record and analyse every message on Twitter and use it to draw conclusions about the public mood. (In practice, most researchers use a subset of that vast “fire hose” of data.) But while we can look at all the tweets, Twitter users are not representative of the population as a whole. (According to the Pew Research Internet Project, in 2013, US-based Twitter users were disproportionately young, urban or suburban, and black.)
There must always be a question about who and what is missing, especially with a messy pile of found data. Kaiser Fung, a data analyst and author of Numbersense, warns against simply assuming we have everything that matters. “N = All is often an assumption rather than a fact about the data,” he says.
Consider Boston’s Street Bump smartphone app, which uses a phone’s accelerometer to detect potholes without the need for city workers to patrol the streets. As citizens of Boston download the app and drive around, their phones automatically notify City Hall of the need to repair the road surface. Solving the technical challenges involved has produced, rather beautifully, an informative data exhaust that addresses a problem in a way that would have been inconceivable a few years ago. The City of Boston proudly proclaims that the “data provides the City with real-time information it uses to fix problems and plan long term investments.”
Yet what Street Bump really produces, left to its own devices, is a map of potholes that systematically favours young, affluent areas where more people own smartphones. Street Bump offers us “N = All” in the sense that every bump from every enabled phone can be recorded. That is not the same thing as recording every pothole. As Microsoft researcher Kate Crawford points out, found data contain systematic biases and it takes careful thought to spot and correct for those biases. Big data sets can seem comprehensive but the “N = All” is often a seductive illusion.
. . .
Who cares about causation or sampling bias, though, when there is money to be made? Corporations around the world must be salivating as they contemplate the uncanny success of the US discount department store Target, as famously reported by Charles Duhigg in The New York Times in 2012. Duhigg explained that Target has collected so much data on its customers, and is so skilled at analysing that data, that its insight into consumers can seem like magic.
Duhigg’s killer anecdote was of the man who stormed into a Target near Minneapolis and complained to the manager that the company was sending coupons for baby clothes and maternity wear to his teenage daughter. The manager apologised profusely and later called to apologise again – only to be told that the teenager was indeed pregnant. Her father hadn’t realised. Target, after analysing her purchases of unscented wipes and magnesium supplements, had.
Statistical sorcery? There is a more mundane explanation.
“There’s a huge false positive issue,” says Kaiser Fung, who has spent years developing similar approaches for retailers and advertisers. What Fung means is that we didn’t get to hear the countless stories about all the women who received coupons for babywear but who weren’t pregnant.
Hearing the anecdote, it’s easy to assume that Target’s algorithms are infallible – that everybody receiving coupons for onesies and wet wipes is pregnant. This is vanishingly unlikely. Indeed, it could be that pregnant women receive such offers merely because everybody on Target’s mailing list receives such offers. We should not buy the idea that Target employs mind-readers before considering how many misses attend each hit.
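A rough sketch of that base-rate arithmetic makes the point. Every number below is invented purely for illustration – Target has published nothing of the sort – but even a seemingly sharp predictor, applied to a mailing list where pregnancy is rare, showers non-pregnant customers with baby coupons.

    # Back-of-envelope false-positive arithmetic with assumed numbers; none of
    # these figures come from Target, they only illustrate the base-rate point.
    customers = 1_000_000
    base_rate = 0.02            # assume 2% of mailing-list customers are pregnant
    sensitivity = 0.80          # assume 80% of pregnant customers are flagged
    false_positive_rate = 0.05  # assume 5% of everyone else is flagged too

    pregnant = customers * base_rate
    true_positives = pregnant * sensitivity
    false_positives = (customers - pregnant) * false_positive_rate

    precision = true_positives / (true_positives + false_positives)
    print(f"Coupon books sent: {true_positives + false_positives:,.0f}")
    print(f"Sent to customers who are actually pregnant: {precision:.0%}")

With these made-up but not outlandish numbers, roughly three coupon books in four land with women who are not pregnant – plenty of raw material for one memorable anecdote and a great many forgettable misses.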
In Charles Duhigg’s account, Target mixes in random offers, such as coupons for wine glasses, because pregnant customers would feel spooked if they realised how intimately the company’s computers understood them.
Fung has another explanation: Target mixes up its offers not because it would be weird to send an all-baby coupon-book to a woman who was pregnant but because the company knows that many of those coupon books will be sent to women who aren’t pregnant after all.
None of this suggests that such data analysis is worthless: it may be highly profitable. Even a modest increase in the accuracy of targeted special offers would be a prize worth winning. But profitability should not be conflated with omniscience.
. . .
In 2005, John Ioannidis, an epidemiologist, published a research paper with the self-explanatory title, “Why Most Published Research Findings Are False”. The paper became famous as a provocative diagnosis of a serious issue. One of the key ideas behind Ioannidis’s work is what statisticians call the “multiple-comparisons problem”.
It is routine, when examining a pattern in data, to ask whether such a pattern might have emerged by chance. If it is unlikely that the observed pattern could have emerged at random, we call that pattern “statistically significant”.
The multiple-comparisons problem arises when a researcher looks at many possible patterns. Consider a randomised trial in which vitamins are given to some primary schoolchildren and placebos are given to others. Do the vitamins work? That all depends on what we mean by “work”. The researchers could look at the children’s height, weight, prevalence of tooth decay, classroom behaviour, test scores, even (after waiting) prison record or earnings at the age of 25. Then there are combinations to check: do the vitamins have an effect on the poorer kids, the richer kids, the boys, the girls? Test enough different correlations and fluke results will drown out the real discoveries.
There are various ways to deal with this but the problem is more serious in large data sets, because there are vastly more possible comparisons than there are data points to compare. Without careful analysis, the ratio of genuine patterns to spurious patterns – of signal to noise – quickly tends to zero.
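A short simulation shows how easily flukes accumulate. The sketch below is illustrative rather than drawn from any real trial: it compares a “vitamin” group with a “placebo” group on 100 outcomes that are nothing but random noise, and still finds a handful of results that clear the conventional 5 per cent significance threshold.

    # A minimal sketch of the multiple-comparisons problem: test enough
    # hypotheses on pure noise and "significant" results appear by chance alone.
    import random
    from statistics import NormalDist

    random.seed(0)
    N_CHILDREN, N_OUTCOMES, ALPHA = 200, 100, 0.05

    def fake_trial():
        """Compare vitamin and placebo groups on one outcome made of pure noise."""
        vitamins = [random.gauss(0, 1) for _ in range(N_CHILDREN)]
        placebo = [random.gauss(0, 1) for _ in range(N_CHILDREN)]
        diff = sum(vitamins) / N_CHILDREN - sum(placebo) / N_CHILDREN
        z = diff / (2 / N_CHILDREN) ** 0.5        # standard error of a difference in means
        return 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

    p_values = [fake_trial() for _ in range(N_OUTCOMES)]
    flukes = sum(p < ALPHA for p in p_values)
    print(f"{flukes} of {N_OUTCOMES} pure-noise comparisons came out 'significant'")

Around five of the hundred comparisons will clear the bar by luck alone – and in a found data set with thousands of variables, luck gets far more chances.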
Worse still, one of the antidotes to the multiple-comparisons problem is transparency, allowing other researchers to figure out how many hypotheses were tested and how many contrary results are languishing in desk drawers because they just didn’t seem interesting enough to publish. Yet found data sets are rarely transparent. Amazon and Google, Facebook and Twitter, Target and Tesco – these companies aren’t about to share their data with you or anyone else.
New, large, cheap data sets and powerful analytical tools will pay dividends – nobody doubts that. And there are a few cases in which analysis of very large data sets has worked miracles. David Spiegelhalter of Cambridge points to Google Translate, which operates by statistically analysing hundreds of millions of documents that have been translated by humans and looking for patterns it can copy. This is an example of what computer scientists call “machine learning”, and it can deliver astonishing results with no preprogrammed grammatical rules. Google Translate is as close to a theory-free, data-driven algorithmic black box as we have – and it is, says Spiegelhalter, “an amazing achievement”. That achievement is built on the clever processing of enormous data sets.
But big data do not solve the problem that has obsessed statisticians and scientists for centuries: the problem of insight, of inferring what is going on, and figuring out how we might intervene to change a system for the better.
“We have a new resource here,” says Professor David Hand of Imperial College London. “But nobody wants ‘data’. What they want are the answers.”
To use big data to produce such answers will require large strides in statistical methods.
“It’s the wild west right now,” says Patrick Wolfe of UCL. “People who are clever and driven will twist and turn and use every tool to get sense out of these data sets, and that’s cool. But we’re flying a little bit blind at the moment.”
Statisticians are scrambling to develop new methods to seize the opportunity of big data. Such new methods are essential but they will work by building on the old statistical lessons, not by ignoring them.
Recall big data’s four articles of faith. Uncanny accuracy is easy to overrate if we simply ignore false positives, as with Target’s pregnancy predictor. The claim that causation has been “knocked off its pedestal” is fine if we are making predictions in a stable environment but not if the world is changing (as with Flu Trends) or if we ourselves hope to change it. The promise that “N = All”, and therefore that sampling bias does not matter, is simply not true in most cases that count. As for the idea that “with enough data, the numbers speak for themselves” – that seems hopelessly naive in data sets where spurious patterns vastly outnumber genuine discoveries.
“Big data” has arrived, but big insights have not. The challenge now is to solve new problems and gain new answers – without making the same old statistical mistakes on a grander scale than ever.

This article was first published in the FT Magazine, 29/30 March 2014. Read it in its original setting here.

Highlights

What next for behavioural economics?

The past decade has been a triumph for behavioural economics, the fashionable cross-breed of psychology and economics. First there was the award in 2002 of the Nobel Memorial Prize in economics to a psychologist, Daniel Kahneman – the man who did as much as anything to create the field of behavioural economics. Bestselling books were launched, most notably by Kahneman himself (Thinking, Fast and Slow, 2011) and by his friend Richard Thaler, co-author of Nudge (2008). Behavioural economics seems far sexier than the ordinary sort, too: when last year’s Nobel was shared three ways, it was the behavioural economist Robert Shiller who grabbed all the headlines.
Behavioural economics is one of the hottest ideas in public policy. The UK government’s Behavioural Insights Team (BIT) uses the discipline to craft better policies, and in February was part-privatised with a mission to advise governments around the world. The White House announced its own behavioural insights team last summer.
So popular is the field that behavioural economics is now often misapplied as a catch-all term to refer to almost anything that’s cool in popular social science, from the storycraft of Malcolm Gladwell, author of The Tipping Point (2000), to the empirical investigations of Steven Levitt, co-author of Freakonomics (2005).
Yet, as with any success story, the backlash has begun. Critics argue that the field is overhyped, trivial, unreliable, a smokescreen for bad policy, an intellectual dead-end – or possibly all of the above. Is behavioural economics doomed to reflect the limitations of its intellectual parents, psychology and economics? Or can it build on their strengths and offer a powerful set of tools for policy makers and academics alike?
A recent experiment designed by BIT highlights both the opportunity and the limitations of the new discipline. The trial was designed to encourage people to sign up for the Organ Donor Register. It was huge; more than a million people using the Driver and Vehicle Licensing Agency website were shown a webpage inviting them to become an organ donor. One of eight different messages was displayed at random. One was minimalist, another spoke of the number of people who die while awaiting donations, yet another appealed to the idea of reciprocity – if you needed an organ, wouldn’t you want someone to donate an organ to you?
BIT devoted particular attention to an idea called “social proof”, made famous 30 years ago by psychologist Robert Cialdini’s book Influence. While one might be tempted to say, “Too few people are donating their organs, we desperately need your help to change that”, the theory of social proof says that’s precisely the wrong thing to do. Instead, the persuasive message will suggest: “Every day, thousands of people sign up to be donors, please join them.” Social proof describes our tendency to run with the herd; why else are books marketed as “bestsellers”?
Expecting social proof to be effective, the BIT trial used three different variants of a social proof message, one with a logo, one with a photo of smiling people, and one unadorned. None of these approaches was as successful as the best alternatives at persuading people to sign up as donors. The message with the photograph – for which the team had high hopes – was a flop, proving worse than no attempt at persuasion at all.
Three points should be made here. The first is that this is exactly why running trials is an excellent idea: had the rival approaches not been tested with an experiment, it would have been easy for well-meaning civil servants acting on authoritative advice to have done serious harm. The trial was inexpensive, and now that the most persuasive message is in use (“If you needed an organ transplant, would you have one? If so, please help others”), roughly 100,000 additional people can be expected to sign up for the donor register each year.
The second point is that there is something unnerving about a discipline in which our discoveries about the past do not easily generalise to the future. Social proof is a widely accepted idea in psychology but, as the donor experiment shows, it does not always apply and it can be hard to predict when or why.
This patchwork of sometimes-fragile psychological results hardly invalidates the whole field but complicates the business of making practical policy. There is a sense that behavioural economics is just regular economics plus common sense – but since psychology isn’t mere common sense either, applying psychological lessons to economics is not a simple task.
The third point is that the organ donor experiment has little or nothing to do with behavioural economics, strictly defined. “The Behavioural Insights Team is widely perceived as doing behavioural economics,” says Daniel Kahneman. “They are actually doing social psychology.”
. . .
The line between behavioural economics and psychology can get a little blurred. Behavioural economics is based on the traditional “neoclassical” model of human behaviour used by economists. This essentially mathematical model says human decisions can usefully be modelled as though our choices were the outcome of solving differential equations. Add psychology into the mix – for example, Kahneman’s insight (with the late Amos Tversky) that we treat the possibility of a loss differently from the way we treat the possibility of a gain – and the task of the behavioural economist is to incorporate such ideas without losing the mathematically-solvable nature of the model.
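As a rough illustration of what incorporating such ideas looks like in practice, here is a prospect-theory-style value function in which losses loom larger than gains. The parameter values are the commonly cited Tversky-Kahneman estimates, used here purely for illustration rather than taken from any of the policy examples in this article.

    # A sketch of the kind of psychological tweak a behavioural economist folds
    # into the maths: a prospect-theory-style value function in which losses
    # loom larger than gains. ALPHA and LAMBDA are the commonly cited
    # Tversky-Kahneman estimates, used here only for illustration.
    ALPHA, LAMBDA = 0.88, 2.25

    def value(x):
        """Subjective value of a gain (x > 0) or loss (x < 0) relative to a reference point."""
        return x ** ALPHA if x >= 0 else -LAMBDA * ((-x) ** ALPHA)

    print(value(100))    # ~57.5: the pleasure of gaining £100
    print(value(-100))   # ~-129.4: the pain of losing £100 is more than twice as great

The function stays simple enough to drop into a standard economic model, which is exactly the compromise described below.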
Why bother with the maths? Consider the example of, say, improving energy efficiency. A psychologist might point out that consumers are impatient, poorly-informed and easily swayed by what their neighbours are doing. It’s the job of the behavioural economist to work out how energy markets might work under such conditions, and what effects we might expect if we introduced policies such as a tax on domestic heating or a subsidy for insulation.
It’s this desire to throw out the hyper-rational bathwater yet keep the mathematically tractable baby that leads to difficult compromises, and not everyone is happy. Economic traditionalists argue that behavioural economics is now hopelessly patched-together; some psychologists claim it’s still attempting to be too systematic.
Nick Chater, a psychologist at Warwick Business School and an adviser to the BIT, is a sympathetic critic of the behavioural economics approach. “The brain is the most rational thing in the universe”, he says, “but the way it solves problems is ad hoc and very local.” That suggests that attempts to formulate general laws of human behaviour may never be more than a rough guide to policy.
The best-known critique of behavioural economics comes from a psychologist, Gerd Gigerenzer of the Max Planck Institute for Human Development. Gigerenzer argues that it is pointless to keep adding frills to a mathematical account of human behaviour that, in the end, has nothing to do with real cognitive processes.
I put this critique to David Laibson, a behavioural economist at Harvard University. He concedes that Gigerenzer has a point but adds: “Gerd’s models of heuristic decision-making are great in the specific domains for which they are designed but they are not general models of behaviour.” In other words, you’re not going to be able to use them to figure out how people should, or do, budget for Christmas or nurse their credit card limit through a spell of joblessness.
Richard Thaler of the University of Chicago, who with Kahneman and Tversky is one of the founding fathers of behavioural economics, agrees. To discard the basic neoclassical framework of economics means “throwing away a lot of stuff that’s useful”.
For some economists, though, behavioural economics has already conceded too much to the patchwork of psychology. David K Levine, an economist at Washington University in St Louis, and author of Is Behavioral Economics Doomed? (2012), says: “There is a tendency to propose some new theory to explain each new fact. The world doesn’t need a thousand different theories to explain a thousand different facts. At some point there needs to be a discipline of trying to explain many facts with one theory.”
The challenge for behavioural economics is to elaborate on the neoclassical model to deliver psychological realism without collapsing into a mess of special cases. Some say that the most successful special case comes from Harvard’s David Laibson. It is a mathematical tweak designed to represent the particular brand of short-termism that leads us to sign up for the gym yet somehow never quite get around to exercising. It’s called “hyperbolic discounting”, a name that refers to a mathematical curve, and which says much about the way behavioural economists represent human psychology.
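Laibson’s tweak is usually written as “beta-delta” or quasi-hyperbolic discounting. A minimal sketch, with parameter values chosen purely for illustration, shows the gym problem in miniature: the plan looks worthwhile from a distance, then crumbles when the moment arrives.

    # A minimal sketch of the "beta-delta" (quasi-hyperbolic) discounting model
    # associated with Laibson's work, with purely illustrative parameter values.
    # Immediate payoffs count in full; anything in the future is scaled by beta.
    BETA, DELTA = 0.7, 0.99   # assumed: strong present bias, mild long-run patience

    def value(payoff, delay):
        """Discounted value today of a payoff arriving 'delay' periods from now."""
        return payoff if delay == 0 else BETA * (DELTA ** delay) * payoff

    # Viewed from a distance, the larger-later reward (the workout, the savings
    # plan) beats the smaller-sooner one...
    print(value(10, delay=30) > value(8, delay=29))   # True: sign up for the gym
    # ...but once the smaller reward is available right now, it wins instead.
    print(value(8, delay=0) > value(10, delay=1))     # True: skip today's session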
The question is, how many special cases can behavioural economics sustain before it becomes arbitrary and unwieldy? Not more than one or two at a time, says Kahneman. “You might be able to do it with two but certainly not with many factors.” Like Kahneman, Thaler believes that a small number of made-for-purpose behavioural economics models have proved their worth already. He argues that trying to unify every psychological idea in a single model is pointless. “I’ve always said that if you want one unifying theory of economic behaviour, you won’t do better than the neoclassical model, which is not particularly good at describing actual decision making.”
. . .
Meanwhile, the policy wonks plug away at the rather different challenge of running rigorous experiments with public policy. There is something faintly unsatisfying about how these policy trials have often confirmed what should have been obvious. One trial, for example, showed that text message reminders increase the proportion of people who pay legal fines. This saves everyone the trouble of calling in the bailiffs. Other trials have shown that clearly-written letters with bullet-point summaries provoke higher response rates.
None of this requires the sophistication of a mathematical model of hyperbolic discounting or loss aversion. It is obvious stuff. Unfortunately it is obvious stuff that is often neglected by the civil service. It is hard to object to inexpensive trials that demonstrate a better way. Nick Chater calls the idea “a complete no-brainer”, while Kahneman says “you can get modest gains at essentially zero cost”.
David Halpern, a Downing Street adviser under Tony Blair, was appointed by the UK coalition government in 2010 to establish the BIT. He says that the idea of running randomised trials in government has now picked up steam. The Financial Conduct Authority has also used randomisation to develop more effective letters to people who may have been missold financial products. “This shift to radical incrementalism is so much more important than some of the grand proposals out there,” says Halpern.
Not everyone agrees. In 2010, behavioural economists George Loewenstein and Peter Ubel wrote in The New York Times that “behavioural economics is being used as a political expedient, allowing policy makers to avoid painful but more effective solutions rooted in traditional economics.”
For example, in May 2010, just before David Cameron came to power, he sang the praises of behavioural economics in a TED talk. “The best way to get someone to cut their electricity bill,” he said, “is to show them their own spending, to show them what their neighbours are spending, and then show what an energy-conscious neighbour is spending.”
But Cameron was mistaken. The single best way to promote energy efficiency is, almost certainly, to raise the price of energy. A carbon tax would be even better, because it encourages people not only to save energy but also to switch to lower-carbon sources of energy. The appeal of a behavioural approach is not that it is more effective but that it is less unpopular.
Thaler points to the experience of Cass Sunstein, his Nudge co-author, who spent four years as regulatory tsar in the Obama White House. “Cass wanted a tax on petrol but he couldn’t get one, so he pushed for higher fuel economy standards. We all know that’s not as efficient as raising the tax on petrol – but that would be lucky to get a single positive vote in Congress.”
Should we be trying for something more ambitious than behavioural economics? “I don’t know if we know enough yet to be more ambitious,” says Kahneman. “But the knowledge that currently exists in psychology is being put to very good use.”
Small steps have taken behavioural economics a long way, says Laibson, citing savings policy in the US. “Every dimension of that environment is now behaviourally tweaked.” The UK has followed suit, with the new auto-enrolment pensions, directly inspired by Thaler’s work.
Laibson says behavioural economics has only just begun to extend its influence over public policy. “The glass is only five per cent full but there’s no reason to believe the glass isn’t going to completely fill up.”

First published on FT.com, Life and Arts, 22 March 2014

Other Writing

Economic quackery and political humbug

British readers will be well aware that the UK Chancellor George Osborne unveiled his budget statement on Wednesday. Here was the piece I wrote that afternoon:

Has there ever been a chancellor of the exchequer more entranced by the game of politics? Most of George Osborne’s Budget speech was trivial. Some of it was imponderable. The final flurry of punches was substantial. Every word was political.

Consider the substantial first: in abolishing the obligation for pensioners to buy annuities, Mr Osborne has snuck up behind an unpopular part of the financial services industry and slugged it with a sock full of coins. (No doubt he will tell us they were minted in memory of the threepenny bit and in honour of Her Majesty the Queen.) This is a vigorous but carefully calibrated tummy rub for sexagenarians with substantial private-sector pensions.

Nobody else will even understand what has been done. The benefit to pensioners is immediate and real. The cost comes later – but Mr Osborne will be long gone by the time the media begin to wring their hands about some poor pensioner who blew his retirement savings on a boiler-room scam.

His other significant moves were equally calculated. A meaningless and arbitrary cap on the welfare budget is no way to rationalise the welfare state but it is a splendid way to tie Labour in knots. A new cash Isa allowance of £15,000 will benefit only the prosperous, and has political appeal while delivering no real benefit – and no real cost to the Treasury – until long after the 2015 election.

Next, the imponderable. Mr Osborne devoted substantial time to the forecasts of the Office for Budget Responsibility, and no wonder: at last the news is good. But while the OBR is independent it is not omniscient. Like other economic forecasters, it has been wrong before and will be again. Mr Osborne forgot this and spoke of growth in future years being “revised up”. This is absurd. The OBR does not get to decide what growth in future years will be. We can draw mild encouragement from its improved forecasts, nothing more.

Finally, the trivial. Any chancellor must master the skill of announcing policies that have little or no place in the macroeconomic centrepiece of the political calendar. Mr Osborne showed no shame. The news that, for example, police officers who die in the line of duty will pay no inheritance tax is appealing but irrelevant. Police deaths are blessedly rare and, since police officers are usually young and modestly paid, inheritance tax is usually a non-issue even in these rare tragedies.

So let us applaud Mr Osborne for playing his own game well – a game in which economic logic is an irritation, the national interest is a distraction and party politics is everything.

You can read this comment in context at FT.com

Other Writing

The Royal accounts are printed in red and gold

The monarchy costs the same as the milk the nation pours on its cereal, says Tim Harford

‘Britain’s Royal Household spent more than it received last year and is doing too little to improve the management of its finances, a parliamentary watchdog says.’, Financial Times, January 28

What – parliamentarians have condemned deficit spending and poor financial management?

They are the experts on such things, I am sure. This is Margaret Hodge MP’s public accounts committee at work. It has a reputation for shaking things up but I’ve never been able to take Ms Hodge seriously since her complaints about a cap on housing benefit.

What was risible about a cap on housing benefit?

Nothing risible about that as such. But because the cap would particularly affect Londoners claiming the benefit, Ms Hodge was among those complaining that it would change the shape of London. She called it “a massive demographic and social upheaval the likes of which have never been seen before”. Since the London mayor’s office – which also opposed the policy – estimated that fewer than 0.2 per cent of the capital’s families would have to move home as a result, that suggests an alarmingly shaky grasp of the numbers for someone whose job is to oversee value for money in public spending.

You’re an unforgiving sort. In any case, Ms Hodge’s committee is concerned about the way the world’s largest housing benefit cheque is being spent.

Yes, the Royal Household receives £31m – a slice of the income from the Crown Estate.

Isn’t that the Queen’s money?

The Estate is nominally the property of the Queen but George III signed over its revenue to parliament.

Wasn’t he the mad one?

Not at the time he gave up the revenues from the Crown Estate. In any case, the current arrangement is only a couple of years old. The Royal Household gets 15 per cent of the income from the Crown Estate. That income, I might add, is sharply rising.

So all this talk of the Queen being down to her last million is nonsense?

It is obviously jolly amusing and has provoked many enormously original jokes. The Queen’s cash reserve has indeed fallen to £1m – not much relative to the scale of the spending required to run the Royal Household. But since income is rising, both from the Crown Estate and from admissions to the likes of Windsor Castle, and spending has been steadily falling, the Royal Household is about to go into surplus. That is more than you can say for the government.

But you can understand why parliament takes an interest. There’s serious money at stake. Think of the hospital beds you could provide for £31m.

Oh, absolutely. You could keep the English National Health Service running for almost three hours for that kind of money.

I’m getting the sense that you’re a monarchist.

Not particularly, but one thing I’m sure of is that the case for or against the monarchy can’t depend on £31m, which is roughly the cost of the milk the nation pours on its cornflakes each morning, plus a bit of tea and toast. This has been reported in all the papers for roughly the same reason that Kim Kardashian’s latest celebrity exploits are reported everywhere.

Why is that, by the way?

Because we’re all monkeys, and we’re fascinated by other monkeys with higher status than us.

But the public accounts committee thinks the real monkeys are the ones in charge of maintaining Royal Household properties – that whoever is in charge of electrical repairs, repointing the bricks, that sort of thing, has been letting things fall into ruin.

Yes – the committee’s view is also that the Royal Household needs to take responsibility for itself and the Treasury needs to take responsibility for it; that it is successfully saving money but should save more; and that it is successfully raising money from tourists but should raise more from tourists. Basically, just think really hard about whatever you already believe is true about the Royal Household, and I am sure I can spare you the trouble of reading the committee’s report.

And do you agree that maintenance of Household properties is lax?

I don’t know because they’ve never invited me in to poke around the plumbing. But it wouldn’t surprise me. There isn’t much competition, and as the great economist John Hicks said: “The best of all monopoly profits is a quiet life.” That may well be how the courtiers felt – until the public accounts committee came along.

Also published at ft.com.
