Brexit referendum: method and benchmarks
Our prediction method (announced in the previous text) rests primarily upon our Facebook survey, where we use a variety of Bayesian updating techniques to filter out internal biases and arrive at the most precise prediction we can. In essence, we ask our participants three things: which option they will vote for (e.g. Leave or Remain in the Brexit referendum), what percentage of the vote they think their preferred choice will get, and what percentage they think other people will estimate Leave or Remain could get. Based on how well participants estimate the prospects of their preferred choice, we score their predictive power and assign greater weight to the better predictors. We call this method the Bayesian Adjusted Facebook Survey (BAFS).
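To make the weighting idea concrete, here is a minimal sketch of how respondents could be scored and weighted by their estimation accuracy. This is not our exact BAFS procedure (the real method updates weights in a Bayesian fashion and also folds in respondents' estimates of other people's expectations); the weighting function and all figures are illustrative assumptions.

```python
# Sketch: weight each respondent by how accurate their vote-share
# estimate is, then compute a weighted vote share. Illustrative only.

def accuracy_weight(estimate, reference):
    """Larger weight for smaller estimation error (assumed form)."""
    return 1.0 / (1.0 + abs(estimate - reference))

# (stated vote, respondent's estimated share for their own choice)
respondents = [
    ("Remain", 55.0), ("Leave", 50.0), ("Remain", 52.0), ("Leave", 47.0),
]
# Hypothetical reference shares used to score estimation skill
reference_share = {"Remain": 52.0, "Leave": 48.0}

weights = [accuracy_weight(est, reference_share[vote]) for vote, est in respondents]
remain_weight = sum(w for (vote, _), w in zip(respondents, weights) if vote == "Remain")
print(f"weighted Remain share: {100 * remain_weight / sum(weights):.1f}%")
```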
In our first election prediction attempt, where we predicted the results of the 2015 general elections in Croatia, we found that our adjusted Facebook poll (AFP)[1] beat all other methods (ours and those of other pollsters), and by a significant margin. Not only did it correctly predict the rise of a complete outlier in those elections, it also gave the closest estimates of the number of seats each party won. Our standard method, combining bias-adjusted polls and socio-economic data, projected a nine-seat difference between the two main competitors (67 to 58; in reality the result was 59 to 56) and a rather modest result for the outlier party, which was projected to finish third with 8 seats – it got 19 instead. Had we used the AFP, we would have given 16 seats to the third party and projected a much closer race between the top two (60 to 57). The remarkable success of the method, particularly given that it operated in a multiparty environment with roughly 10 parties having realistic chances of entering parliament (6 of which competed for third-party status, all newly founded within a year of the elections, with no prior electoral data), encouraged us to improve it further, which is why we tweaked it into the BAFS.
In addition to our central prediction method, we will track a series of benchmarks against which to compare the BAFS. We hypothesize (quite modestly) that the BAFS will beat them all. We will use the following benchmarks:
Adjusted polling average – we examine all the relevant polling companies in the UK based on their past performance in predicting the outcomes of the past two general elections (2015 and 2010) and one recent referendum, the 2014 Scottish independence referendum. We could go further back in time and take polls for local elections into consideration as well. However, we believe the more recent elections adequately capture the shift in polling methods, along with their contemporary downsides. As far as local elections are concerned, we fear they tend to be too specific: predicting local outcomes and national outcomes are two different things, and more precision at the local level need not translate into more precision at the national level. Given that the vote we are concerned with is national (the EU referendum), it makes sense to focus only on the past performance of national-level polls. We are, however, open to discussion regarding this assumption.
In total we covered 516 polls from more than 20 different pollsters in the UK across the selected elections. Each pollster has been ranked according to its precision. The precision ranking operates on a time scale (predictions closer to the election carry greater weight), and a simple Brier score is calculated to determine the forecasting accuracy of each pollster. Based on this ranking, weights are assigned to all the pollsters. To calculate the final adjusted polling average, we take all available national polls; adjust each one for timing, sample size, mode (online or telephone), and the pollster's pre-determined ranking weight; and take the weighted average. We also calculate the probability distribution for our final adjusted polling average.
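As a rough illustration of how such an adjustment can work, the sketch below weights each poll by recency and by its pollster's past Brier score. The half-life constant, the scoring data, and the exact weighting form are all assumptions made for the example, not our actual parameters.

```python
# Sketch: combine polls into an adjusted average using recency and
# past-accuracy weights. All numbers are invented for illustration.

# Hypothetical past Brier scores (lower = historically more accurate)
brier = {"PollsterA": 0.10, "PollsterB": 0.20, "PollsterC": 0.35}

# Current polls: (pollster, Remain share in %, days before the vote)
polls = [("PollsterA", 51.0, 30), ("PollsterB", 53.0, 10), ("PollsterC", 49.0, 5)]

HALF_LIFE = 14.0  # assumed: a poll's weight halves every two weeks

def poll_weight(pollster, days_out):
    recency = 0.5 ** (days_out / HALF_LIFE)  # newer polls count more
    accuracy = 1.0 / brier[pollster]         # better pollsters count more
    return recency * accuracy

weights = [poll_weight(name, days) for name, _, days in polls]
adjusted = sum(w * share for w, (_, share, _) in zip(weights, polls)) / sum(weights)
print(f"adjusted polling average for Remain: {adjusted:.2f}%")
```

Adjustments for sample size and survey mode would enter as additional multiplicative factors in the same weight.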
Regular polling average – this will be the same as above, except it won't be adjusted for any prior bias of the given pollster, nor for sample size. It is adjusted only for timing (more recent polls get greater weight). We look at all the polls conducted within two months of the most recent poll.
What UK Thinks Poll of polls – this is an average of only the six most recent polls, produced by What UK Thinks, a non-partisan website run by the NatCen Social Research agency. The set of polls that goes in changes each week as pollsters release new polling data. The method is simple averaging (it shows moving averages) without any weighting. Here's the intuition.
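Replicating that kind of unweighted poll of polls is straightforward; here is a sketch with made-up numbers:

```python
# Poll of polls in the What UK Thinks spirit: the plain mean of the
# six most recent Remain shares. Figures are made up for illustration.
remain_shares = [51, 55, 53, 56, 52, 54, 50, 57]  # oldest -> newest

last_six = remain_shares[-6:]
print(f"poll of polls (Remain): {sum(last_six) / len(last_six):.1f}%")
```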
Forecasting polls – these are polls that ask people to estimate how much one choice will get over another. They differ from regular polls in that they don't ask who you will vote for, but who you think the rest of the country will vote for. The information for this estimate is also gathered via the What UK Thinks website (see sample questions here, here, here, and here).
Prediction markets – we use a total of seven betting markets: PredictIt, PredictWise, Betfair, Pivit, Hypermind, IG, and iPredict. Their estimates are also distributed on a time scale, where recent predictions are given greater weight. Each market is additionally weighted by its trading volume, so that we can calculate and compare one single prediction from all the markets (as we do with the polls). Unlike regular polls, prediction markets don't produce estimates of the total percentage one option will get over another; they instead offer probabilities that a given outcome will occur, so the comparison with the BAFS will be done purely on the basis of outcome probabilities.
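Here is a sketch of how several markets' implied probabilities could be folded into one number (not our exact formula; the half-life and all market names and figures are invented for the example):

```python
# Sketch: volume- and recency-weighted combination of market-implied
# Remain probabilities. All figures below are invented.
markets = [
    # (name, Remain probability, trading volume, days since last quote)
    ("MarketA", 0.76, 500_000, 1),
    ("MarketB", 0.72, 120_000, 2),
    ("MarketC", 0.74,  40_000, 5),
]

HALF_LIFE = 7.0  # assumed: a quote's weight halves every week

def market_weight(volume, days_old):
    return volume * 0.5 ** (days_old / HALF_LIFE)

weights = [market_weight(vol, days) for _, _, vol, days in markets]
combined = sum(w * p for w, (_, p, _, _) in zip(weights, markets)) / sum(weights)
print(f"combined market probability of Remain: {combined:.2%}")
```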
Prediction models – if any. The idea is to examine the results of prediction models such as the ones done by Nate Silver and FiveThirtyEight. However, so far FiveThirtyEight hasn't done any predictions on the UK Brexit referendum (I guess they are preoccupied with the US primaries and are probably staying away from the UK for now, given their poor result at the 2015 general election). One example of such a model based purely on socio-economic data (without taking into consideration any polling data, so quite different from Silver) is the one done by the UK political science professor Matt Qvortrup, who wraps it all up in a simple equation:

Support for EU = 54.4 + 11.5 × (word dummy) + 2 × (inflation) − 1.2 × (years in office)[2]

Accordingly, his prediction is 53.9% for the UK to Remain. We will try to find more such efforts to compare our method with.
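To see how such an equation yields a point prediction, here is a quick evaluation with hypothetical inputs (these are our guesses purely to illustrate the arithmetic, not Qvortrup's actual input values):

```python
# Qvortrup's equation with the coefficients quoted above; the inputs
# below are hypothetical, chosen only to illustrate the arithmetic.
def support_for_eu(word_dummy, inflation, years_in_office):
    return 54.4 + 11.5 * word_dummy + 2 * inflation - 1.2 * years_in_office

# e.g. word dummy = 0, inflation = 0.3%, government ~0.9 years in office
print(support_for_eu(0, 0.3, 0.9))  # -> 53.92, close to the quoted 53.9%
```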
Superforecaster WoC – we utilize the wisdom of the superforecaster crowd. "Superforecasters" is a colloquial term for the best participants in Philip Tetlock's Good Judgment Project (GJP) (there's even a book about them). The GJP was part of a wider forecasting tournament organized by the US government agency IARPA following the intelligence community fiasco regarding the WMDs in Iraq. The government wanted to find out whether there exists a more formidable way of making predictions. The GJP crowd (all volunteers, regular people, seldom experts) significantly outperformed everyone else several years in a row. Hence the title – superforecasters (there are a number of other interesting facts about them – read more here, or buy the book). However, superforecasters are only a subset of the more than 5,000 forecasters who participate in the GJP. Given that we cannot really calculate and average out the performance of the top predictors within that crowd, we have to take the collective consensus forecast. Finally, similar to the betting markets, the GJP doesn't ask its participants to predict the actual voting percentage; it only asks them to gauge the probability of an event occurring. We therefore compare only the probability numbers in this benchmark.
Finally, we will calculate the mean of all the given benchmarks. That will be the final robustness test of the BAFS method.
So far, one month before the referendum, here is the rundown of the benchmark methods (these will be updated over time):
Method | Remain | Leave
Adjusted polling average* | 50.5 | 47.16
Regular polling average* | 51.04 | 46.99
Poll of polls | 54 | 46
Prediction models | 53.9 | 46.1
Mean | 52.36 | 46.56

Note: updated as of 23rd May 2016.
The
following table expresses it in terms of probabilities:
Method | Remain | Leave
Adjusted polling average | 66.89 | 33.11
Regular polling average | 67.13 | 32.87
Forecasting polls* | 62.98 | 31.05
Prediction markets | 74.85 | 25.15
Superforecaster WoC | 77 | 23
Mean | 69.77 | 29.04

Note: updated as of 23rd May 2016.
* For the adjusted polling average, the regular polling average, and the forecasting polls we have factored in the undecided voters as well.
[1] Note: this is not the same method
as we use now, even though it was quite similar.
[2] See his paper(s) for further
clarification.