Today’s xkcd is a work of subtle comic genius. The strip’s regular readers will all understand it as a reference to a debate about the validity of FiveThirtyEight’s election-forecasting technique. For anyone else, read on.
For stat-heads like me, the most exciting part of Tuesday’s vote was not the outcome itself, because anyone who looked at the data seriously realized that President Obama was an overwhelming favorite for re-election. Rather, it was the resolution of a silly debate between the innumerate majority of the blogosphere* and media and a small number of data-based forecasters, most prominently Nate Silver of FiveThirtyEight. Tons of people called the election a “tossup” and critiqued Silver for calling Obama the favorite. A lot of this exhibited a misunderstanding of statistics. A 90% chance of victory does not mean that Obama will win for sure. It means that in 9 out of 10 runs of the election we’d expect him to win, with the remaining 1 in 10 covering the possibility of last-second shifts in the electorate toward Romney relative to what the polls showed.
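If that frequency reading seems abstract, here’s a toy simulation that makes it concrete. The 0.9 is just the headline forecast number; it isn’t pulled from Silver’s actual model.

```python
import random

# Toy illustration of the frequency reading of a "90% chance of victory".
# The 0.9 is a stand-in for the headline forecast probability.
p_win = 0.9
n_runs = 10_000

wins = sum(random.random() < p_win for _ in range(n_runs))
print(f"Obama wins in {wins / n_runs:.1%} of {n_runs} simulated runs")
# Roughly 9 in 10 runs come out as wins; the other ~1 in 10 stand in for
# polling misses or late shifts toward Romney big enough to flip the result.
```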
So how did Silver do? Unfortunately I can’t find a way to permalink to the results, but as of 9:10 AM Central African Time on November 7th, MSNBC’s results have Silver correctly predicting every single state that has been called, and correctly forecasting the leader in the states yet to be decided. That includes Florida, where a naive state-level polling average predicted a Romney win – but Silver adjusted those polls to account for recent movement toward Obama nationally.
This is an absolutely dominating performance. It is a triumph for the use of data over hand-waving, and for statistics as a field. If anything, Silver’s track record suggests his predictions were too modest. For many swing states, he gave the eventual victor only a barely-better-than-50% chance of winning. If he were correctly judging his margin of error, he’d have made a couple of mistakes tonight. He was hedging against the chance that the polls were biased, but in fact these days polls are pretty damned accurate.
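The arithmetic behind that last point is simple enough to sketch. The swing-state probabilities below are invented stand-ins, not Silver’s actual numbers:

```python
# Back-of-the-envelope version of the argument above: with well-calibrated
# swing-state probabilities in the 55-90% range, you'd expect a miss or two
# just by chance. These probabilities are made up for illustration.
swing_state_probs = [0.55, 0.58, 0.62, 0.68, 0.72, 0.80, 0.85, 0.91]

expected_misses = sum(1 - p for p in swing_state_probs)
print(f"Expected missed calls: {expected_misses:.1f}")  # ~2.3 with these numbers
```

Calling every one of them correctly suggests the stated probabilities were, if anything, too cautious.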
Tomorrow morning, I want to see virtually every political commentator, and all those crazy dudes running websites forecasting Romney landslides, admit that they were wrong, and that analyzing the data and using it to form the best possible forecasts not only works, but works spectacularly well.
*There were also critiques by fairly smart people, pointing out Silver’s lack of transparency and arguing that using 3 significant digits for the probability of an Obama win is unreasonable. This post is not about them.
My friend, you have a terribly underdeveloped understanding of media bias. Known outcomes are bad for ratings. The media is and always will be biased toward cliffhangers.
The polls worked fine this time, but there were major statistical flub-ups of various sorts in 2000 and 2004. I doubt the statistics of survey research have seen much progress since then, so it makes sense to remain cautious about them. For instance, I posted a link to a poll of ‘likely’ voters in Michigan in which 98.2% of people surveyed claimed they were ‘definitely’ voting. These uber-civically-involved folks were apparently the ones who constituted the survey sample. Strikes me as an object lesson in the perils of stated-preference estimates.
>The polls worked fine this time, but there were major statistical flub-ups of various sorts in 2000 and 2004.
This is a good point. In the Princeton Election Consortium post I linked to in the footnote, one commenter said that Silver should be judged on whether the states he gives the Democrat a 51% chance in actually go Democratic 51% of the time. I don’t like that method: it discounts the possibility of an electoral Black Swan, which his forecasts explicitly try to account for; think of the eleventh-hour reveal of Dubya’s DUI arrest in 2000, which pushed the polls toward Gore. On the other hand, the size of the poll-average misses has dropped a lot since the 1980s. We really were a lot worse at doing surveys back then; people used to change question wording all the time, for example.
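For what it’s worth, the mechanics of that kind of calibration check are easy to sketch; all the numbers below are made up, and with one election’s worth of correlated state results you wouldn’t learn much from it anyway:

```python
from collections import defaultdict

# Sketch of the calibration check the commenter proposed, with made-up data:
# group state calls by their stated win probability and see how often the
# favored candidate actually won. None of these numbers are real forecasts.
forecasts = [
    # (stated probability the Democrat wins, did the Democrat win?)
    (0.51, True), (0.51, False), (0.79, True), (0.92, True),
    (0.92, True), (0.99, True), (0.99, True), (0.99, True),
]

buckets = defaultdict(list)
for prob, dem_won in forecasts:
    buckets[round(prob, 1)].append(dem_won)

for prob in sorted(buckets):
    outcomes = buckets[prob]
    print(f"stated ~{prob:.0%}: Democrat won {sum(outcomes)}/{len(outcomes)} states")
```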
The thing about self-reported voting intentions is that the question isn’t incentive-compatible. There’s a lot of shame in admitting you don’t plan to vote. But it might be a decent variable to use in a likely-voter model; if you’re willing to *admit* you’re not voting, then you’re probably not going to vote.
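Something along these lines, say, where the stated answer gets mapped to a turnout weight; every number here is invented, it’s just the shape of the idea:

```python
# A crude version of the idea above: use the respondent's own stated intention
# as a turnout weight in the poll average, but discount "definitely voting"
# heavily since it is over-reported. Every weight here is invented.
intention_weight = {
    "definitely voting": 0.85,
    "probably voting": 0.55,
    "probably not": 0.15,
    "definitely not": 0.02,  # admitting you won't vote is a strong signal
}

def turnout_weight(stated_intention: str) -> float:
    """Weight a respondent's answers by how likely they are to actually vote."""
    return intention_weight[stated_intention]

print(turnout_weight("definitely not"))  # 0.02
```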
Educated people over-report.
Flowing Data has a nice chart showing just how good his predictions were. http://flowingdata.com/2012/11/07/how-silver-predictions-performed/
I’d like to see a plot of his point predictions for state-level vote shares versus the reality, but not enough to try to make it myself.
Agreed completely. I’m hoping he’ll do it himself.
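In case anyone wants to take a crack at it in the meantime, here’s a rough matplotlib sketch of what such a plot could look like; every number in it is a placeholder rather than a real forecast or result:

```python
import matplotlib.pyplot as plt

# Rough sketch of the predicted-vs-actual plot discussed above. The numbers
# are placeholders, not Silver's forecasts or the certified results; swap in
# the real state-level Obama vote shares to produce the actual chart.
states = ["FL", "OH", "VA", "CO"]
predicted = [49.8, 51.3, 50.7, 50.8]  # hypothetical forecast vote shares (%)
actual = [50.0, 50.7, 51.2, 51.5]     # hypothetical election results (%)

fig, ax = plt.subplots()
ax.scatter(predicted, actual)
for state, x, y in zip(states, predicted, actual):
    ax.annotate(state, (x, y))

# Points on the 45-degree line are perfect point predictions.
lims = [min(predicted + actual) - 1, max(predicted + actual) + 1]
ax.plot(lims, lims, linestyle="--", color="gray")
ax.set_xlabel("Predicted Obama vote share (%)")
ax.set_ylabel("Actual Obama vote share (%)")
ax.set_title("Point predictions vs. results (placeholder data)")
plt.show()
```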
He was so much more transparent and easy to use back in ’08; I could have just copied the numbers and cleaned them up in Notepad++. Now it’s all glitz, and I can’t find any of the internals or raw numbers.