This blog post is a short introduction to the science of prediction, a topic I have been totally immersed in over the last few months and recently presented on at the 2014 ESOMAR Congress with Hubertus Hofkirchner. I thought I would share some of what I have learned.
The accuracy of any prediction is based roughly on this formula...
Prediction accuracy = Quality of information x Effort put into making the prediction x (1 - Difficulty of accurately aggregating all the dependent variables) x Objectivity with which you can do this x Randomness of the event
P = Q x E x (1 - D) x O x R
Here is the thinking behind this:
- If you have none of the right information, your prediction will be unreliable
- If you don't put any effort into processing the information, your prediction may be unreliable
- The more complex the task of weighing up and analysing the information needed to make a prediction, the less likely it is that the prediction will be correct
- Unless you stand back from the prediction and look at things objectively, your prediction could be subject to biases which lead to you making an inaccurate prediction
- Ultimately, prediction accuracy is capped by the randomness of the event. For example, predicting the outcome of tossing a coin once versus 10,000 times involves completely different levels of prediction reliability.
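To make the arithmetic concrete, here is a minimal Python sketch of the formula above. The 0-to-1 scaling of each factor, and the reading of R as the predictable (non-random) share of the event, are assumptions made for illustration rather than anything defined in the post:

```python
def prediction_accuracy(quality, effort, difficulty, objectivity, randomness):
    """Toy scoring of P = Q x E x (1 - D) x O x R.

    Each factor is assumed to be scaled 0..1, where 1 means perfect
    information / maximum effort / full objectivity. 'randomness' is
    read here as the predictable (non-random) share of the outcome,
    so a single coin toss would score near 0.
    """
    for name, value in [("quality", quality), ("effort", effort),
                        ("difficulty", difficulty),
                        ("objectivity", objectivity),
                        ("randomness", randomness)]:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return quality * effort * (1 - difficulty) * objectivity * randomness

# A well-informed, diligent, objective forecast of a fairly
# predictable event still ends up well short of certainty:
print(prediction_accuracy(0.9, 0.8, 0.2, 0.9, 0.9))  # ~0.47
```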
Realize that prediction accuracy is not directly linked to sample size
You might note, as a market researcher, that this formula does not directly depend on sample size: one person with access to the right information, who is prepared to put in enough effort, has the skills needed to process this data and is able to remain completely objective, can make as good a prediction as a global network of market research companies interviewing millions of people on the same subject! I cite as an example Nate Silver's achievement of single-handedly predicting the 2012 US election result in all 50 states.
Now obviously we are not all as smart as Nate Silver, we don't have access to as much information, few of us would be prepared to put in the same amount of effort, and many of us may not be able to process this information as objectively.
So it does help to have more than one person involved, to ensure that the errors caused by one person's lack of information, or another person's lack of effort or objectivity, can be accounted for.
So how many people do you need to make a prediction?
Now this is a good question, and the answer, obviously, is that it depends.
It firstly depends on how much expertise the people making the prediction have on the subject individually, and how much effort they are prepared to make. If they all know their stuff, or are prepared to do some research and put some thought into it, then you need far fewer people than you might think.
16 seems to be about the ideal size for an active, intelligent prediction group
In 2007, Jed Christiansen of the University of Buckingham took a look at this. He used a future event with very little general coverage and impact, rowing competitions, and asked participants to predict the winners. A daunting task, as there are no clever pundits airing their opinions in the press, as there are in soccer. However, Christiansen recruited his participant pool from the teams and their (smallish) fan base through a rowing community website; in other words, he found experts. He found that the magic number was as little as 16: markets with 16 traders or more were well-calibrated, while below that number prices could not be driven far enough.
The Iowa Electronic Market, which is probably the most famous of prediction systems out there and has successfully been used to predict over 600 elections, has, I understand, involved an average of fewer than 20 traders per prediction.
Taking account of ignorance
Taking account of cognitive bias
It gets even harder if you start to take into account the cognitive biases of the random sample. For example, just by asking whether you think it will rain tomorrow, more people will say yes than no because of latent acquiescence bias. We have tested this out in experiments: if you ask people to predict how many wine drinkers prefer red wine, the prediction will be 54%; if you instead ask people to predict how many wine drinkers prefer white wine, the number who select red wine drops to 46%. So it's easy to see how cognitive biases like this make predicting things difficult.
In the weather example above, this effect would instantly cancel out the opinions of the experts, and no matter how many people you interviewed you would never be able to get an accurate weather forecast prediction from the crowd unless you accounted for this bias.
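The post doesn't prescribe a correction method, but one simple way to account for a framing effect like the red/white wine example is to ask the question both ways and average the mirrored estimates. The sketch below is an illustration of that idea, not the authors' protocol:

```python
def debias_mirrored(estimate_framing_a, estimate_framing_b_complement):
    """Average two mirrored framings of the same share.

    estimate_framing_a: crowd estimate of "% who prefer red" when the
        question is asked about red wine (e.g. 54).
    estimate_framing_b_complement: the red-wine share implied when the
        question is asked about white wine instead (e.g. 46).
    Returns a single debiased estimate of the red-wine share.
    """
    return (estimate_framing_a + estimate_framing_b_complement) / 2

# Using the figures quoted above: asking about red gives 54%, asking
# about white implies 46% for red; averaging the two framings cancels
# the acquiescence effect and gives 50%.
print(debias_mirrored(54, 46))  # 50.0
```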
This is just one of a number of biases that impact the accuracy of our predictions, one of the worst being our emotions.
Asking a Manchester United football fan to predict the result of their team's match is nigh on useless, as it is almost impossible for them to envisage losing a match, due to their emotional attachment to the team.
This makes political predictions particularly difficult.
Prediction biases can be introduced simply as a result of how you ask the question
Picking the right aggregation process
The importance of measuring prediction confidence
In the world of prediction it's all about working out how to differentiate the good predictors from the bad, and one of the simplest techniques for doing this is to ask people how confident they are in their prediction.
For example, if I had watched the weather forecast I would be a lot more confident in predicting tomorrow's weather than if I had not. So it would be sensible, when asking people to predict tomorrow's weather, to ask them whether they had seen the forecast and how confident they were. From this information you could easily separate the "signal" from the "noise".
The trick with all prediction protocols is to find a way of isolating the people who are better informed than others and better at objectively analyzing that information, but in most cases it's not as easy as asking whether they have seen the weather forecast.
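As a simple illustration of the kind of confidence weighting described here, the sketch below weights each respondent's prediction by their self-rated confidence. The linear weighting scheme is an assumption made for the sketch, not a method taken from the paper:

```python
def confidence_weighted_forecast(predictions):
    """Aggregate (prediction, confidence) pairs into one forecast.

    predictions: list of (value, confidence) tuples, where value is the
        predicted probability of rain (0..1) and confidence is the
        respondent's self-rated confidence (0..1). Confidence is used
        as a simple linear weight; a real protocol would calibrate it.
    """
    total_weight = sum(conf for _, conf in predictions)
    if total_weight == 0:
        raise ValueError("at least one respondent must report some confidence")
    return sum(value * conf for value, conf in predictions) / total_weight

# Two respondents who saw the forecast (high confidence) outweigh
# three who are guessing (low confidence):
responses = [(0.8, 0.9), (0.7, 0.8), (0.5, 0.2), (0.4, 0.1), (0.6, 0.2)]
print(round(confidence_weighted_forecast(responses), 2))  # 0.7
```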
For more complex predictions, like predicting the result of a sports match, the relationship between prediction confidence and prediction accuracy is not directly linear, but confidence weighting can certainly help, provided it is carefully calibrated. How you go about this is a topic for another blog post.
In the meantime, if you are interested in finding out more about prediction science, read our recently published ESOMAR paper titled Predicting the future.