Monday, 30 May 2016

Brexit referendum: method and benchmarks

Our prediction method (announced in the previous text) rests primarily on our Facebook survey, where we use a variety of Bayesian updating methodologies to filter out internal biases and offer the most precise prediction we can. In essence, we ask our participants to state who they will vote for (e.g. Leave or Remain in the Brexit referendum), what percentage they think their preferred choice will get, and what percentage they think other people will estimate that Leave or Remain could get. Depending on how well they estimate the chances of their preferred choice, we will determine their predictive power and give higher weight to the better predictors. We call this method the Bayesian Adjusted Facebook Survey (BAFS).
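To make the idea of "higher weight to the better predictors" concrete, here is a minimal sketch. It is not the BAFS itself: the inverse-error weighting rule, the `reference_share` input, and the function name are all assumptions made purely for illustration.

```python
def weighted_remain_share(respondents, reference_share):
    """Accuracy-weighted share of Remain votes, in percent.

    respondents: list of (vote, estimate) pairs, where vote is "Remain" or
        "Leave" and estimate is the Remain percentage the respondent predicts.
    reference_share: hypothetical Remain percentage used to score estimates
        (an assumption; the post does not say what estimates are scored against).
    """
    # Assumed rule: the further an estimate is from the reference,
    # the smaller the respondent's weight.
    weights = [1.0 / (1.0 + abs(estimate - reference_share))
               for _, estimate in respondents]
    remain_weight = sum(w for (vote, _), w in zip(respondents, weights)
                        if vote == "Remain")
    return 100.0 * remain_weight / sum(weights)
```

With equally accurate respondents this reduces to a plain vote count; as one respondent's estimate drifts away from the reference, that respondent's vote counts for less.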

In our first election prediction attempt, where we predicted the results of the 2015 general elections in Croatia, we found that our adjusted Facebook poll (AFP)[1] beat all other methods (both ours and those of other pollsters), and by a significant margin. Not only did it correctly predict the rise of a complete outlier in those elections, it also gave the closest estimates of the number of seats each party got. Our standard method, combining bias-adjusted polls and socio-economic data, projected a 9 seat difference between the two main competitors (67 to 58; in reality the result was 59-56), and a rather modest result for the outlier party, which was projected to be third with 8 seats – it got 19 instead. Had we used the AFP we would have given 16 seats to the third party, and a much closer race between the first two parties (60-57). The remarkable success of the method, particularly given that it operated in a multiparty environment with roughly 10 parties with realistic chances of entering parliament (6 of which competed for third place, all newly founded within a year of the elections, with no prior electoral data), encouraged us to improve it further, which is why we tweaked it into the BAFS.

In addition to our central prediction method we will use a series of benchmarks for comparison with our BAFS method. We hypothesize (quite modestly) that the BAFS will beat them all.

We will use the following benchmarks:

Adjusted polling average – we examine all the relevant polling companies in the UK based on their past performance in predicting the outcomes of the past two general elections (2015 and 2010) and one recent referendum – the Scottish independence referendum in 2014. We could go further back in time and take into consideration the polls of local elections as well. However, we believe the more recent elections adequately capture the shift in polling methods as well as their contemporary downsides. As far as local elections are concerned, we fear that they tend to be too specific, and that predicting local outcomes and predicting national outcomes are two different things. More precision on a local level need not translate into more precision on a national level. Given that the vote of our concern is national (the EU referendum), it makes sense to focus only on the past performance of national-level polls. We are, however, open to discussion regarding this assumption.

In total we covered 516 polls from more than 20 different pollsters in the UK across the selected elections. Each pollster has been ranked according to its precision. The precision ranking is determined on a time scale (predictions closer to the election carry a greater weight), and a simple Brier score is calculated to determine the forecasting accuracy of each pollster. Based on this ranking, weights are assigned to all the pollsters. To calculate the final adjusted polling average we take all available national polls, weight them according to their timing, their sample size, whether they were conducted online or by telephone, and their pre-determined ranking weight, and take the weighted average. We also calculate the probability distribution for our final adjusted polling average.
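As a rough illustration of the ranking step, the sketch below computes a Brier score per pollster and turns the scores into weights. The Brier score itself is standard (the mean squared difference between forecast probabilities and 0/1 outcomes); the mapping from score to weight and the helper names are assumptions, and the adjustments for timing, sample size and polling mode are left out for brevity.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities (0..1) and outcomes (0 or 1)."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def ranking_weights(brier_by_pollster):
    """Assumed rule: weight is proportional to (1 - Brier score), normalised to sum to 1."""
    raw = {name: 1.0 - score for name, score in brier_by_pollster.items()}
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("no pollster has a usable score")
    return {name: value / total for name, value in raw.items()}


def adjusted_average(polls, weights):
    """Weighted mean of poll results; polls is a list of (pollster, remain_share)."""
    used = [(weights[name], share) for name, share in polls]
    return sum(w * s for w, s in used) / sum(w for w, _ in used)
```

A lower Brier score means better past forecasts, so under this rule a pollster with a score of 0.1 ends up with a larger weight than one with 0.3.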

Regular polling average – this will be the same as above, except it won’t be adjusted for any prior bias of the given poll nor will it be adjusted based on sample size. It is only adjusted based on timing (the more recent get a greater weight). We look at all the polls done at least two months before the last poll.
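One simple way to give more recent polls a greater weight is exponential decay. This is only a sketch: the decay form and the 14-day half-life are assumptions, since the post does not state the exact timing weights.

```python
def recency_weighted_average(polls, half_life_days=14.0):
    """Average poll results with weights that halve every half_life_days.

    polls: list of (age_in_days, remain_share), where age is measured
        back from the most recent poll (age 0).
    """
    weights = [0.5 ** (age / half_life_days) for age, _ in polls]
    weighted_sum = sum(w * share for w, (_, share) in zip(weights, polls))
    return weighted_sum / sum(weights)
```

For example, a poll taken today counts twice as much as one taken 14 days earlier under the default half-life.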

What UK Thinks Poll of polls – this is a poll averaging only the six most recent polls, done by a non-partisan website What UK Thinks, run by the NatCen Social Research agency. The structure of what goes in changes each week as new pollsters share new polling data. The method is simple averaging (it shows moving averages) without weighting anything. Here’s the intuition. 

Forecasting polls – these are polls based on asking the people to estimate how much one choice will get over another. They are different than regular polls as they don’t ask who you will vote for, but who you think the rest of the country will vote for. The information for this estimate is also gathered via the What UK Thinks website (see sample questions here, here, here, and here).  

Prediction markets – we use a total of seven betting markets. We use the estimates from PredictIt, PredictWise, Betfair, Pivit, Hypermind, Ig, and iPredict. They are also distributed on a time scale where recent predictions are given a greater weight. Each market is also given a weight based on the volume of trading, so that we can calculate and compare one single prediction from all the markets (as we do with the polls). The prediction markets, unlike the regular polls, don’t produce estimates of the total percentage one option will have over another. They instead offer probabilities that a given outcome will occur, so the comparison with the BAFS will be done purely on the basis of the probability distributions of an outcome.
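The sketch below shows one way to arrive at a single market-based probability. It assumes a market quotes decimal odds for the two outcomes (some of the markets listed quote prices instead), removes the bookmaker margin by normalising, and then weights each market by trading volume. The function names and the input layout are illustrative, and the recency weights are omitted.

```python
def implied_probabilities(remain_odds, leave_odds):
    """Convert decimal odds to probabilities that sum to 1."""
    raw_remain = 1.0 / remain_odds
    raw_leave = 1.0 / leave_odds
    total = raw_remain + raw_leave  # exceeds 1 when the odds include a margin
    return raw_remain / total, raw_leave / total


def combine_markets(markets):
    """Volume-weighted Remain probability.

    markets: list of (remain_probability, traded_volume).
    """
    total_volume = sum(volume for _, volume in markets)
    if total_volume <= 0:
        raise ValueError("total traded volume must be positive")
    return sum(p * volume for p, volume in markets) / total_volume
```

Normalising first matters because raw implied probabilities from odds typically add up to more than 100%, which would otherwise make markets with larger margins look more confident than they are.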

Prediction models – if any. The idea is to examine the results of prediction models such as the ones done by Nate Silver and FiveThirtyEight. However, so far FiveThirtyEight hasn’t made any predictions on the UK Brexit referendum (I guess they are preoccupied with the US primaries and are probably staying away from the UK for now, given their poor result at the 2015 general election). One example of a model based purely on socio-economic data (without taking into consideration any polling data, so quite different from Silver) is the one by UK political science professor Matt Qvortrup, who racks it all up into a simple equation: Support for EU= 54.4 + Word-Dummy*11.5 + Inflation*2. – Years in Office*1.2.[2] Accordingly, his prediction is 53.9% for the UK to Remain. We will try to find more such efforts to compare our method with.

Superforecaster WoC – we utilize the wisdom of the superforecaster crowd. “Superforecasters” is a colloquial term for participants in Philip Tetlock’s Good Judgment Project (GJP) (there’s even a book about them). The GJP was a part of a wider forecasting tournament organized by the US government agency IARPA following the intelligence community fiasco regarding the WMDs in Iraq. The government wanted to find out whether a better way of making predictions exists. The GJP crowd (all volunteers, regular people, seldom experts) significantly outperformed everyone else several years in a row. Hence the title – superforecasters (there’s a number of other interesting facts about them – read more here, or buy the book). However, superforecasters are only a subset of the more than 5000 forecasters who participate in the GJP. Given that we cannot really calculate and average out the performance of the top predictors within that crowd, we have to take the collective consensus forecast. Finally, similar to the betting markets, the GJP doesn’t ask its participants to predict the actual voting percentage; it only asks them to gauge the probability of an event occurring. We therefore only compare the probability numbers in this benchmark.

Finally, we will calculate the mean of all the given benchmarks. That will be the final robustness test of the BAFS method.

One month before the referendum, here is the rundown of the benchmark methods (these will be updated over time):

| Method | Remain (%) | Leave (%) |
|---|---|---|
| Adjusted polling average* | 50.5 | 47.16 |
| Regular polling average* | 51.04 | 46.99 |
| Poll of polls | 54 | 46 |
| Prediction models | 53.9 | 46.1 |
| Mean | 52.36 | 46.56 |

Note: updated as of 23rd May 2016.

The following table expresses it in terms of probabilities:

| Method | Remain (%) | Leave (%) |
|---|---|---|
| Adjusted polling average | 66.89 | 33.11 |
| Regular polling average | 67.13 | 32.87 |
| Forecasting polls* | 62.98 | 31.05 |
| Prediction markets | 74.85 | 25.15 |
| Superforecaster WoC | 77 | 23 |
| Mean | 69.77 | 29.04 |

Note: updated as of 23rd May 2016.

* For the adjusted polling average, the regular polling average, and for the forecasting polls we have factored in the undecided voters as well.
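The Mean rows in both tables are plain unweighted averages of the benchmarks above them, which can be checked with a few lines of Python. The figures are copied from the tables; the variable and function names are illustrative.

```python
from statistics import mean

# (Remain, Leave) per benchmark, as of 23rd May 2016.
vote_share = {
    "Adjusted polling average": (50.5, 47.16),
    "Regular polling average": (51.04, 46.99),
    "Poll of polls": (54.0, 46.0),
    "Prediction models": (53.9, 46.1),
}
probability = {
    "Adjusted polling average": (66.89, 33.11),
    "Regular polling average": (67.13, 32.87),
    "Forecasting polls": (62.98, 31.05),
    "Prediction markets": (74.85, 25.15),
    "Superforecaster WoC": (77.0, 23.0),
}


def benchmark_mean(table):
    """Unweighted mean of the Remain and Leave columns, rounded to two decimals."""
    remain = mean([r for r, _ in table.values()])
    leave = mean([l for _, l in table.values()])
    return round(remain, 2), round(leave, 2)


print(benchmark_mean(vote_share))   # Mean row of the vote-share table
print(benchmark_mean(probability))  # Mean row of the probability table
```

Note that the two columns need not sum to 100 wherever undecided voters are factored in, which is why the Leave means are not simply 100 minus the Remain means.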




[1] Note: this is not the same method as we use now, even though it was quite similar. 
[2] See his paper(s) for further clarification. 
