Thursday, November 8, 2012

Comment: Sabermetrics and Political Forecasting

I have steadfastly avoided off-topic posts on this site and fully intend to continue to do so.  However, there is a fascinating subplot to the just-completed Presidential election that has a baseball connection, and I would like to call it to people's attention in case you are not already aware of it.  The discussion will be limited to the science of politics.  I will strive to keep it non-partisan.  Policy discussions will be avoided.

Nate Silver is a baseball sabermetrician, or at least he was one until recently.  He made a name for himself in baseball sabermetric circles by developing a forecasting system for player performance that he called PECOTA (Player Empirical Comparison and Optimization Test Algorithm).  In 2003, Silver became a regular contributor to Baseball Prospectus and soon after sold PECOTA to BP for a partnership interest.  These projections are eagerly awaited each spring by baseball fans, possibly even more so by fantasy baseball players hoping for an edge in their fantasy drafts.

In 2007, Silver began writing analyses and predictions of the 2008 Presidential campaign under the pseudonym "Poblano" for the Daily Kos, a liberal political website.  In March of 2008, he established his own website, FiveThirtyEight.com, and revealed his true identity later that summer.  He quickly gained prominence in the field of politics when he correctly predicted the outcome of the Presidential contest in 49 of 50 states and in all 35 Senate races.

Silver left Baseball Prospectus in 2009 to devote himself full time to his political blog, which was licensed for publication by the New York Times under a new name:  FiveThirtyEight:  Nate Silver's Political Calculus.  He continued to publish regular updates through the 2010 mid-term elections, the primary season of the 2012 election, and the 2012 election itself.

Silver is an open supporter of Barack Obama and reportedly has even shared information with the Obama campaign, at least in the 2008 election cycle, but insists that his methodology is based on science rather than opinion.  He has been particularly critical of the Rasmussen poll, which has consistently favored Republican candidates.  His criticism is that Rasmussen's methodology tends to select an older, less tech-savvy population than polls which use more modern means of communication than just telephones.  In terms of results, Rasmussen has been spectacularly wrong in both 2008 and 2012, so Silver certainly has results to validate his criticisms.

One of the more interesting sideshows of the 2012 election came when former Republican congressman Joe Scarborough, now a commentator for MSNBC and host of a political talk show called Morning Joe, called out Silver, who was confidently predicting a repeat win for Barack Obama.  Scarborough said it was impossible for anyone to call the election as anything more than a tossup.  Silver responded by offering to bet $1000, to be donated by the loser to the Red Cross, that his forecast would prove to be correct.  Scarborough countered by offering a joint donation with Silver of $2000 to the Red Cross.  Silver responded by upping the ante to $2000.  The bet offer landed him in hot water with the Editor of the NY Times, the parent company for his blog.

As election day approached, Silver became increasingly confident of an Obama win and posted a 90.9% chance of the President being re-elected.  In the end, Silver accurately predicted the winner in all 50 states (I believe Florida is still unofficial, but appears to be safely in the Obama column at this time), while Rasmussen Reports missed 6 of 9 swing states.

Nate Silver appears to be a restless soul and has indicated that he does not expect to continue political analysis for the rest of his life, or even into the near future.  He has already delved into other areas such as quantifying the "Most Livable Neighborhoods in New York." He has also indicated some interest in returning to his first love, baseball.

Blogger's Note:  By way of disclosure, most of the details for this post were found in Wikipedia, verified from my own memory of things I have read elsewhere.

I will accept comments that are limited to the science of politics, and of course, baseball.  I will delete comments on political opinion, policy and any and all rants or diatribes.  This is not a political opinion website.  My reason for posting this is to draw attention to the connection between the statistical analysis of baseball and politics.

Here's an interesting factoid I read today:  Mitt Romney got exactly the same percentage of the white vote in this election as George H. W. Bush got in the 1988 election, when he beat Mike Dukakis in a landslide.

Factoid #2:  This is the first time the United States has had 3 consecutive 2-term Presidents since Jefferson, Madison and Monroe!

23 comments:

  1. This comment has been removed by a blog administrator.

  2. This comment has been removed by a blog administrator.

  3. Nothing bad, just incomprehensible. Not sure why you would go through the word puzzles just to post something like that.

    Replies
    1. Must be young whippersnappers, my eyes are so bad at reading those puzzles that I regularly fail the puzzle (of course, that is also a function of me commenting a lot too...)

  4. Thanks DrB, very interesting! I've been hearing about this tangentially, been curious but not enough to check out more, so I appreciate the history lesson here.

    Too bad PECOTA isn't as accurate as his political analysis. Then again, it would certainly change baseball tremendously if anyone were ever able to do that. I'm not sure what the implications of such a perfect predictor would be, or whether it would be good or bad for the game. My first thought is that it would be very bad, because then you could just plug everything into a computer and spit out the results, taking the human element out of the game, so maybe that's my answer.

    Of course, the closest that methodology can reach is probably talent level for a player, as the holy grail. I don't think we can ever measure how a player will react to pressure, particularly the pressures of the playoffs, and how quickly they learn to ignore that pressure and play normally.

    Replies
    1. In fairness, I should have pointed out that Silver is not infallible on the political side. He was quite spectacularly wrong in the Scott Brown Massachusetts Senate win in 2010.

  5. I can't say that I was entirely thrilled by the accuracy of Silver's predictions, but am certainly interested in them. Thanks for the excellent post.

  6. I am not sure about the similarities.

    Is he using statistics in his political predictions?

    Replies
    1. Polling has always relied on statistics for its veracity, since polls only ask questions of a very small sample of the total population. Think of the +/- number at the end of every poll, which is meant to represent how far off the reported numbers may be because of the sample size. Silver did three things: he aggregated all polls to increase the total sample; he weighted poll results on historical bias and current divergence (if 9 polls said 51-49 and a 10th said 55-45, he weighted that 10th one less in his total); and he broke down polls by state, tracked who was leading in each state, and mapped a path/likelihood to the 270 electoral votes needed to win the presidency. It was all statistical manipulation of other people's numbers (i.e. polls) to develop weights, totals and probabilities. His "gut" only played a role in developing weights to test (and then either reject or accept based on their statistical usefulness and predictive quality).

    2. They seem like two different procedures.

      One is like saying that, based on your body temperature readings over the last two years, you are likely to have this body temperature now (without checking if you are conscious or not).

      The other one is like saying: let me take a reading every 3 seconds, from different parts of your body, for 1 or 2 hours, assign different weights, and I will tell you what your likely average body temperature will be.

    3. BLSL,

      I'm not sure how those two similes are all that different.

      Political polling is way more than just asking random people questions. You have to add in historical probabilities for turnout, likelihood of voting this time, demographics.

      I do have to admit that it appears that the biggest contribution Silver made to forecasting this election is calling BS on Rasmussen which, strangely, nobody else seemed willing to do. I follow a site called realclearpolitics.com, a site that leans right but links articles from the full political spectrum. They also post a list of poll results and a poll average. I just used my own eyeball test to mentally eliminate Rasmussen from the average as an outlier and I had a pretty good idea of how the election was going to turn out. I was able to approximate with my eyes and brain what Silver did with his statistical calculations. In his defense, if you want to blog about it and be taken seriously, you have to do the math.

    4. You're right that there are a lot of factors one should consider in coming up with representative samples.

      The analogy would be to throw out some data points of a particular player, or in the case of team stats, data points of some players who may not be on the team any more at that particular moment in time under examination.

    5. Recognizing outliers and things that are likely to regress vs trends that are likely to progress are major aspects of both sabermetrics and political prognosticating.

    6. And therein lies the rub.

      At some point, we are back to the 'scouts' approach where human judgment takes precedence over cold, hard numbers.
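
For readers who want to see the arithmetic behind the first reply in this thread (the +/- number on a poll, pooling many polls, and giving a divergent poll less weight), here is a minimal Python sketch. It is not Silver's actual model. The function names, the 1,000-person sample sizes and the 0.5 weight on the outlier poll are all assumptions made up for illustration, and treating the pooled polls as one big random sample is a simplification.

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a simple random sample."""
    return z * math.sqrt(share * (1.0 - share) / sample_size)

def weighted_poll_average(polls):
    """Average a candidate's share across polls.

    Each poll is (share, sample_size, weight). The weight is a
    hypothetical trust factor, e.g. below 1.0 for a poll that
    diverges from the rest.
    """
    total = sum(n * w for _, n, w in polls)
    return sum(p * n * w for p, n, w in polls) / total

# Hypothetical data: nine polls at 51% and one outlier at 55%,
# each of 1,000 people; the outlier gets half weight.
polls = [(0.51, 1000, 1.0)] * 9 + [(0.55, 1000, 0.5)]

single_moe = margin_of_error(0.51, 1000)
pooled_share = weighted_poll_average(polls)
pooled_n = sum(n for _, n, _ in polls)
pooled_moe = margin_of_error(pooled_share, pooled_n)

print(f"one poll: 51.0% +/- {single_moe:.1%}")
print(f"pooled:   {pooled_share:.1%} +/- {pooled_moe:.1%}")
```

The point of the sketch is only that pooling shrinks the +/- number and that a down-weighted outlier barely moves the average. The third step described in that reply, mapping state-by-state leads to 270 electoral votes, is not shown here.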

  7. The two statistical procedures seem to be very different, don't they? The political predictions rely on public opinion polling--with claims adjusted by Silver, of course--on a matter settled by the real-life working out of what the numbers that the pollsters provide are intended to predict. Silver is brilliant at this, deriving a macrocosm from its supposed microcosm. My trust in him is such that at election time I dote on him. There is no such one-to-one relationship possible for PECOTA's baseball predictions. They're based on much more treacherous data, from past performance and a select set of comparable players; and one can't trust them, though they're better than having no information at all, save stray scraps and impressions and eyeballing old stats from fangraphs. That's not saying much. I remember objecting to Sabean's detractors after a lousy season a few years ago that every Giants starter save for one (who, I don't recall) had markedly underperformed his PECOTA projections--that is, Sabean got the bad results that year that he would equally have had as a devout PECOTA sabermetrician.

  8. Big fan of Nate's PECOTA and 538. What he has helped do is bring data- and stats-based analysis into a traditionally "eye test" world. Obama and the Giants this year were prime examples where the pundits/baseball experts got it wrong, and by a lot. Both underdogs, both playing against the traditional big money clubs, and both got it right by focusing on what is really proven statistically to make the difference: in the playoffs, pitching and defense, and in elections, a 12-county, swing state GOTV ground game. No time for big egos, just a relentless and stats-honed focus.

    Both Detroit and Romney were left shell-shocked because they believed their own grandiose bullcrap of how good they were - and found out the hard way what stat-wise analysis buys you - reality and victory.

    Replies
    1. Stat-wise analysis? Is that what we did? I thought we brought good ol hustle and small ball, and then muscled up when we needed the rope-a-dope. Plus those wonderful terms heart and grit that saber folks just love...

      I'm strictly baseball on the interwebz, no comments on the politico. But it is interesting stuff.

    2. I think everything is stat-related. For example, crossing a street is an exercise in stat analysis - you weigh the odds of getting hit by a truck before you cross, if you are a reasonable person. That's why most people cross the street they live on but not interstate freeways.

      That means everyone does.

      In science, a theory that applies always is not falsifiable and therefore is not a good one. If everyone does stat analysis, it does not make sense to attribute the difference to that, which is to say, my guess is that the Giants do more than stat-wise analysis.

    3. Well the point is what stats are you looking at? The ones that only reflect and reinforce your world view or do you attempt to make them as real and accurate as possible given all the variables?

      The Giants strategically chose, since 2006-7, to invest in pitching, defense, speed and now youth to build their nascent dynasty. Did stats play a part? Can't say for sure, and Sabes ain't toot'n that horn, but it sure seems like the Giants have a plan and have executed to it. Of course the players have to play and execute - but look at the results!

      Nate's work at 538 was heavily criticized by camp Romney and Fox pundits as a skewed and Demo-leaning approach. To the point that they now concede camp Romney un-skewed Nate's results internally to suit their "Romney is winning" meme and raise more $$$.

      Point is you can be stats-wise or stats stupid, and reality bites cold and hard when you don't build your strategy looking at the right stats.

    4. MS,

      Very well said. I agree completely.

    5. Well said indeed, though I don't agree completely.

      I will try to add to it.

      You're right when you say it depends on what stats you look at. When you ask if stats played a part, the answer is yes and no. It's yes, as everyone looks at numbers like home runs, batting averages, etc. So, yes, stats probably played a part. It's no, if they are proprietary stats that belong to someone else. So, like you say, it depends on what you are looking at and, because of that, the answer to the question of whether stats played a part could be yes or it could be no.

      As to the question of stats-wise or stats-stupid, I will just say we should make a distinction between stats-too-much and stats-appropriate, and say that that distinction is different from the stats-wise/stats-stupid distinction.

  9. Hey DrB where did that last post go with the updates? the blogspot go blogsplat?

    Replies
    1. I must have forgotten to re-post it after I updated Ricky's big line from yesterday's AFL game. It's back now.
