(perhaps only tangentially useful, but interesting!)
What Research Tells Us About Making Accurate Predictions
by Walter Frick
Harvard Business Review
February 2, 2015
“Prediction is very difficult,” the old chestnut goes, “especially about the future.” And for years, social science agreed. Numerous studies detailed the forecasting failures of even so-called experts. Predicting the future is just too hard, the thinking went; HBR even published an article about how the art of forecasting wasn’t really about prediction at all.
That’s changing, thanks to new research.
We know far more about prediction than we used to, including the fact that some of us are better at it than others. But prediction is also a learned skill, at least in part — it’s something that we can all become better at with practice. And that’s good news for businesses, which have tremendous incentives to predict a myriad of things.
The most famous research on prediction was done by Philip Tetlock of the University of Pennsylvania, and his seminal 2006 book Expert Political Judgment provides crucial background. Tetlock asked a group of pundits and foreign affairs experts to predict geopolitical events, like whether the Soviet Union would disintegrate by 1993. Overall, the “experts” struggled to perform better than “dart-throwing chimps,” and were consistently less accurate than even relatively simple statistical algorithms. This was true of liberals and conservatives, and regardless of professional credentials.
But Tetlock did uncover one style of thinking that seemed to aid prediction. Those who preferred to consider multiple explanations and balance them together before making a prediction performed better than those who relied on a single big idea. Tetlock called the first group foxes and the second group hedgehogs, after an essay by Isaiah Berlin. As Tetlock writes:
The intellectually aggressive hedgehogs knew one big thing and sought, under the banner of parsimony, to expand the explanatory power of that big thing to “cover” new cases; the more eclectic foxes knew many little things and were content to improvise ad hoc solutions to keep pace with a rapidly changing world.
Since the book, Tetlock and several colleagues have been running a series of geopolitical forecasting tournaments (which I’ve dabbled in) to discover what helps people make better predictions. Over the last six months, Tetlock, Barbara Mellers, and several of their Penn colleagues have released three new papers analyzing 150,000 forecasts by 743 participants (all with at least a bachelor’s degree) competing to predict 199 world events. One paper focuses solely on high-performing “super forecasters”; another looks at the entire group; and a third makes the case for forecasting tournaments as a research tool.
The main finding? Prediction isn’t a hopeless enterprise — the tournament participants did far better than blind chance. Think about a prediction with two possible outcomes, like who will win the Super Bowl. If you pick at random, you’ll be wrong half the time. But the best forecasters were consistently able to cut that error rate by more than half. As Tetlock put it to me, “The best forecasters are hovering between the chimp and God.”
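To make that arithmetic concrete, here is a minimal sketch of how such an error rate could be scored on binary questions. The question count, the 80% hit rate for the skilled forecaster, and the flat hit-or-miss scoring are all assumptions for illustration; the tournaments actually score graded probability forecasts rather than plain yes/no calls.

    import random

    def error_rate(predictions, outcomes):
        """Fraction of binary calls that missed the actual outcome."""
        misses = sum(1 for p, o in zip(predictions, outcomes) if p != o)
        return misses / len(predictions)

    random.seed(0)

    # Hypothetical setup: 1,000 yes/no questions with random true outcomes.
    outcomes = [random.choice([0, 1]) for _ in range(1000)]

    # A "dart-throwing chimp" guesses at random and misses about half the time.
    chimp = [random.choice([0, 1]) for _ in outcomes]

    # An assumed forecaster who calls 80% of questions correctly cuts that
    # error rate by more than half (roughly 0.50 down to 0.20).
    skilled = [o if random.random() < 0.8 else 1 - o for o in outcomes]

    print(f"random guessing error:    {error_rate(chimp, outcomes):.2f}")
    print(f"skilled forecaster error: {error_rate(skilled, outcomes):.2f}")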
Perhaps most notably, top predictors managed to improve over time, and several interventions on the part of the researchers improved accuracy. So the second finding is that it’s possible to get better at prediction, and the research offers some insights into the factors that make a difference.
Intelligence helps. The forecasters in Tetlock’s sample were a smart bunch, and even within that sample those who scored higher on various intelligence tests tended to make more accurate predictions. But intelligence mattered more early on than it did by the end of the tournament. It appears that when you’re entering a new domain and trying to make predictions, intelligence is a big advantage. Later, once everyone has settled in, being smart still helps but not quite as much.
Domain expertise helps, too. Forecasters who scored better on a test of political knowledge tended to make better predictions. If that sounds obvious, remember that Tetlock’s earlier research found little evidence that expertise matters. But while fancy appointments and credentials might not have correlated with good prediction in earlier research, genuine domain expertise does seem to.
Practice improves accuracy. The top-performing “super forecasters” were consistently more accurate, and only became more so over time. A big part of that seems to be that they practiced more, making more predictions and participating more in the tournament’s forums.
Teams consistently outperform individuals. The researchers split forecasters up randomly, so that some made their predictions on their own, while others did so as part of a group. Groups have their own problems and biases, as a recent HBR article explains, so the researchers gave the groups training on how to collaborate effectively. Ultimately, those who were part of a group made more accurate predictions.
Teamwork also helped the super forecasters, who after Year 1 were put on teams with each other. This only improved their accuracy. These super-teams were unique in one other way: as time passed, most teams became more divided in their opinions, as participants became entrenched in their beliefs. By contrast, the super forecaster teams agreed more and more over time.
More open-minded people make better predictions. This harkens back to Tetlock’s earlier distinction between foxes and hedgehogs. Though participants’ self-reported status as “fox” or “hedgehog” didn’t predict accuracy, a commonly used test of open-mindedness did. While some psychologists see open-mindedness as a personality trait that’s static within individuals over time, there is also some evidence that each of us can be more or less open-minded depending on the circumstances.
Training in probability can guard against bias. Some of the forecasters were given training in “probabilistic reasoning,” which basically means they were told to look for data on how similar cases had turned out in the past before trying to predict the future. Humans are surprisingly bad at this, and tend to overestimate the chances that the future will be different than the past. The forecasters who received this training performed better than those who did not. (Interestingly, a smaller group was trained in scenario planning, but this turned out not to be as useful as the training in probabilistic reasoning.)
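The papers describe the training in more detail than the article does; as a rough sketch of the underlying idea, assume the forecaster first looks up how often comparable past cases turned out a certain way, and only then makes a bounded case-specific adjustment. The ceasefire question and every number below are invented for illustration.

    def base_rate_forecast(similar_cases, adjustment=0.0):
        # similar_cases: outcomes of comparable past situations, 1 = happened, 0 = didn't.
        # adjustment: a bounded case-specific tweak in probability points (e.g. +0.05).
        base_rate = sum(similar_cases) / len(similar_cases)
        return min(1.0, max(0.0, base_rate + adjustment))

    # Invented question: will this year's ceasefire hold for twelve months?
    # Suppose 4 of 20 comparable ceasefires held (made-up history).
    past_ceasefires = [1] * 4 + [0] * 16

    print(base_rate_forecast(past_ceasefires))        # 0.2, the historical base rate
    print(base_rate_forecast(past_ceasefires, 0.05))  # 0.25 after a small case-specific bump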
Rushing produces bad predictions. The longer participants deliberated before making a forecast, the better they did. This was particularly true for those who were working in groups.
Revision leads to better results. This isn’t quite the same thing as open-mindedness, though it’s probably related. Forecasters had the option to go back later on and revise their predictions, in response to new information. Participants who revised their predictions frequently outperformed those who did so less often.
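The research doesn’t prescribe how revisions should be made, but one disciplined way to fold new information into an existing forecast is a simple Bayes update; the prior and the two likelihoods below are assumed purely for illustration.

    def bayes_update(prior, p_report_if_yes, p_report_if_no):
        # prior: current probability that the event will happen.
        # p_report_if_yes / p_report_if_no: how likely the new report would be
        # if the event were, or were not, going to happen (both assumed here).
        numerator = prior * p_report_if_yes
        return numerator / (numerator + (1 - prior) * p_report_if_no)

    # Start at 30%; a report arrives that is three times likelier if the event
    # is on track (0.6) than if it is not (0.2), so the forecast rises to ~56%.
    print(round(bayes_update(0.30, 0.6, 0.2), 2))  # 0.56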
Together these findings represent a major step forward in understanding forecasting. Certainty is the enemy of accurate prediction, and so the unstated prerequisite to forecasting may be admitting that we’re usually bad at it. From there, it’s possible to use a mix of practice and process to improve.
However, these findings don’t speak to one of the central findings of Tetlock’s earlier work: that humans typically made worse predictions than algorithms. Other research has found that one reliable way to boost humans’ forecasting ability is to teach them to defer to statistical models whenever possible. And the “probabilistic training” described above really just involves teaching humans to think like simple algorithms.
You could argue that we’re learning how to make better predictions just in time to be eclipsed in many domains by machines, but the real challenge will be in blending the two. Tetlock’s paper on the merits of forecasting tournaments is also about the value of aggregating the wisdom of the crowd using algorithms. Ultimately, a mix of data and human intelligence is likely to outperform either on its own. The next challenge is finding the right algorithm to put them together.
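One simple family of aggregation rules in this line of research is to average the crowd’s probabilities and then push the average away from 0.5 (“extremizing”), since plain averaging tends to pull a crowd with real evidence back toward 50/50. The sketch below shows that general shape; the specific log-odds transform and the parameter a are illustrative assumptions, not the tournament’s published algorithm.

    import math

    def aggregate(probabilities, a=2.0):
        # Plain mean of the crowd's probability forecasts for one yes/no event.
        mean = sum(probabilities) / len(probabilities)
        # Push the mean back out on the log-odds scale; a = 1.0 leaves it
        # unchanged, a > 1 extremizes (the value 2.0 is an assumed default).
        log_odds = math.log(mean / (1.0 - mean))
        extremized = 1.0 / (1.0 + math.exp(-a * log_odds))
        return mean, extremized

    # Five hypothetical forecasters on the same question:
    crowd = [0.65, 0.70, 0.60, 0.75, 0.80]
    mean, extremized = aggregate(crowd)
    print(f"simple mean: {mean:.2f}")       # 0.70
    print(f"extremized:  {extremized:.2f}")  # ~0.84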