Stolen Thunder: Pollsters Ignored Their “Check Assumptions” Lights

Back in 2000 and again in 2004, I enjoyed a small piece of influence through political opinion poll analysis. Statistics is an intriguing science, all the more because it tries to quantify and predict human behavior. But that same human behavior also skews how people think, including analysts, and in 2008 and 2012 it caused me to miss important trends in American politics. I was embarrassingly wrong in predicting the Presidential elections, especially missing the energy of Obama’s 2008 run. So I backed off, paid more attention to my regular job and family, and paid less attention to statistics. Others enjoyed the attention of poll mavens, especially Nate Silver, who turned his statistical devotion to baseball into political prominence during Obama’s wins. But Silver made the same mistake I did, and in his case the embarrassment is greater because as a professional statistician, he really ought to have known better. Silver let his enthusiasm for Democrat opinion cause him to ignore warning signs until it was too late to avoid a face plant.

Let’s have a quick review of how polls saw the 2016 Presidential Election, and also how polls work, and finally how predictive analysis is created.

Hillary Clinton announced her decision to run for the White House on April 12, 2015. This is important because Clinton already enjoyed significant name recognition and with the roles of First Lady, Senator and Secretary of State on her resume, she would start as an obvious front-runner for the Democrats’ nomination. Nate Silver’s national polling average put her at 59.9% support for the party nomination at the beginning (I’m using Silver here for two reasons – first, his projections are built from aggregates of major national polls, and second, Silver was the most prominent poll analyst quoted in the media). She enjoyed media support through the end of 2015 as the presumptive front-runner, but by the end of October 2015 Clinton’s lead over Sanders in Silver’s chart was down to 46.8% to 26.1%, notable not for Sanders’ strength but for Hillary’s weakness. By February 2016, Silver put the race at 49.6% Clinton to 39.1% Sanders – note that Hillary’s campaign was failing to win over most of the undecideds, losing them to Sanders more than four to one. By April 23, 2016 Silver had the race 49.6% Clinton to 41.5% Sanders; note two important factors: first, Hillary appeared to have a lead bigger than Sanders could close, but second, Sanders had more momentum than Clinton, and had enjoyed higher energy for some months. By the end of June, Silver showed the race 55.4% Clinton to 36.5% Sanders, essentially a done deal for the Democratic Party nomination.

http://projects.fivethirtyeight.com/election-2016/national-primary-polls/democratic/
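The “more than four to one” figure above can be checked from the quoted percentages alone. This is a minimal sketch in Python, assuming that each candidate’s gain between late October 2015 and February 2016 came from previously undecided voters (a simplification, since respondents can also switch candidates); the function name is mine, not Silver’s.

```python
# Poll-average shares quoted in the text (percent).
october_2015 = {"Clinton": 46.8, "Sanders": 26.1}
february_2016 = {"Clinton": 49.6, "Sanders": 39.1}


def share_gains(before, after):
    """Return each candidate's change in share, in percentage points."""
    return {name: round(after[name] - before[name], 1) for name in before}


gains = share_gains(october_2015, february_2016)

# Sanders' gain relative to Clinton's over the same period.
ratio = gains["Sanders"] / gains["Clinton"]

print(gains)
print(round(ratio, 1))
```

Sanders gained 13.0 points to Clinton’s 2.8 over those months, which is where the ratio comes from.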

Donald Trump announced his candidacy for the office of the President on June 16, 2015. At that time Silver’s national polling average put his support for the GOP nomination at 3.6%. Let’s stop there and consider that this meant the polls showed Hillary Clinton’s support for her party’s nomination was more than sixteen times greater than Donald Trump’s support for his party’s nomination. Part of this was due to the heavy number of candidates for the Republican nod, but also Donald Trump – while known as a face and name – was unknown as a political contender, so he had to establish his bona fides with both the GOP and the voters. Trump’s campaign quickly gained support, however, as he passed the 20% threshold on July 26, 2015, and the 40% threshold on March 21, 2016. This means that Donald Trump had not reached even 40% support among Republican voters until after his Super Tuesday wins in Alabama, Arkansas, Georgia, Massachusetts, Tennessee, Virginia and Vermont. On March 22, Trump claimed another 58 delegates by winning the Arizona primary. By the end of May, Trump had essentially locked up the GOP nomination.

http://projects.fivethirtyeight.com/election-2016/national-primary-polls/republican/

Both Clinton and Trump finished the win-the-nomination part of their campaigns with damage, however. Trump’s problems were obvious – to energize his base, Trump attacked establishment Republicans and demographics aligned with opponents of populist theory, and this cost him nationally in polls. In early June, polls showed Trump’s support at 38.1%, compared to 42.1% for Clinton. But Clinton had obvious problems, too. The way Clinton won the Democrats’ nomination left many Sanders supporters convinced the primary had been rigged, which may be one reason Trump made similar claims as the General Election reached its resolution. But also, given the many demographic groups Trump had – allegedly – attacked, a four-point lead for Clinton was a clear warning sign that something was not as described.

Call it a poll version of that annoying “check engine” light on your dashboard. Until you have someone get under the hood, you don’t know what exactly has gone wrong, but you can’t ignore it unless you don’t mind spending hours on the side of the road beside your smoking vehicle, at the mercy of passing traffic. There is science behind a poll that is put together and analyzed properly, but laziness or assumptions in your data or procedures can invalidate your conclusions, and make you look a fool in public.

By the way, Nate Silver uses an aggregate of polls, but he is also guilty of some subjectivity in his source selection. For example, Silver’s aggregate shows Clinton had a wire-to-wire lead over Trump in polling, with Trump never enjoying a lead in the aggregate polling at any time:

http://projects.fivethirtyeight.com/2016-election-forecast/national-polls/

Real Clear Politics, however, which also uses an aggregate of polls, showed Donald Trump with an aggregate lead on May 24 and from July 25 through July 28 of this year.

http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton-5491.html

That’s not to say one aggregate is ‘better’ than the other, but to illustrate the fact that any aggregate is subjective and contains implicit bias. Ironically, Silver was aware of this bias and tried to correct for it – he calls this “trend line adjustment” – but in the end Silver’s own bias still influenced his conclusions.

http://www.huffingtonpost.com/entry/nate-silver-election-forecast_us_581e1c33e4b0d9ce6fbc6f7f
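To see how source selection alone can flip an aggregate, here is a minimal sketch in Python. The four polls and their numbers are entirely hypothetical, and the plain average is my simplification; it is not Silver’s or Real Clear Politics’ actual method.

```python
from statistics import mean

# Hypothetical polls: (pollster, Clinton %, Trump %). Not real data.
POLLS = [
    ("Poll A", 45.0, 41.0),
    ("Poll B", 44.0, 43.0),
    ("Poll C", 42.0, 44.0),
    ("Poll D", 41.0, 44.0),
]


def average_margin(polls, include):
    """Mean Clinton-minus-Trump margin over the pollsters named in `include`."""
    margins = [clinton - trump for name, clinton, trump in polls if name in include]
    return mean(margins)


# Two aggregators that differ only in which pollsters they accept.
aggregate_1 = average_margin(POLLS, {"Poll A", "Poll B", "Poll C"})
aggregate_2 = average_margin(POLLS, {"Poll B", "Poll C", "Poll D"})

print(aggregate_1, aggregate_2)
```

Swapping one pollster for another moves the first aggregate from a Clinton lead to a Trump lead in the second, even though both draw on the same pool of polls.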

It’s important to remember that Silver was wrong about Trump winning the GOP nomination. After Trump won the GOP nomination, Silver admitted “we basically got the Republican race wrong.”

http://fivethirtyeight.com/features/why-republican-voters-decided-on-trump/

There was no evidence that Silver went back to find the evidence he overlooked in his initial analyses, which could have corrected his results in the General Campaign. But there is, at least, evidence that Silver knew something in the numbers was wrong. Just before the final day of the election, Silver put out his “final election update”, giving Clinton a 71% chance of winning.

http://fivethirtyeight.com/features/final-election-update-theres-a-wide-range-of-outcomes-and-most-of-them-come-up-clinton/?ex_cid=2016-forecast

This ran contrary to far more aggressive posts from the New York Times, which gave Clinton an 82% probability of winning,

http://www.nytimes.com/elections/forecast/president

the Princeton Election Consortium, which gave Clinton a 93% chance to win the White House,

http://election.princeton.edu/2016/11/08/final-mode-projections-clinton-323-ev-51-di-senate-seats-gop-house/

left-leaning pundit Larry Sabato, who did not offer a probability but called for Clinton to win 347 Electoral Votes,

http://ijr.com/2016/08/667335-famed-election-predictor-with-97-100-track-record-reveals-his-trump-vs-hillary-2016-results/

and of course the Huffington Post, which posted that Clinton had a 98% chance to win the Oval Office.

http://elections.huffingtonpost.com/2016/forecast/president

Anyone who turned on ABC, NBC, CBS, CNN, or Fox was also flooded with assurances that Clinton was poised to win by large margins. That all of these analysts were wrong, and to such a large degree, is amusing given their hubris, but concerning given their prominence in media coverage of the election.

The last week of the election, Nate Silver’s concerns about the polling data caused him to scale back his probability for Clinton (he initially had Clinton at 89%, but as the election approached he walked it back to 71%), while Ryan Grim of the Huffington Post kept Clinton at a 98% chance to win. This led to some ill-advised words on Twitter between the two men about each other’s methodology.

http://www.vox.com/2016/11/6/13542328/nate-silver-huffpo-polls

Ironically, while Silver was correct that weighting Clinton’s advantage beyond anything supported by poll data was foolish, he failed to properly test the underlying assumptions installed in his own model.

I found it intriguing to notice that neither Gallup nor Pew published polls for the Presidential election, each focusing on issues rather than candidates. A business reason was provided,

http://time.com/4067019/gallup-horse-race-polling/

but given the long history and prominence Gallup and Pew enjoyed in polling Presidential races, the reason given rings false. A more likely explanation is the difficulty in addressing behavior changes in the voting public. In addition to the shift from landline phones to cell phones, voters are more likely to discuss opinions online than in a phone interview, but there is no statistically sound means to randomly contact respondents online, and the results of online polls are as varied as the opinions they report. Pew observed that online polls are “non-probability” polls, which eliminates by definition the random nature of polls, and therefore calls into question any political conclusion presented by such a poll.

http://www.pewresearch.org/fact-tank/2014/07/28/qa-what-the-new-york-times-polling-decision-means/

Pew also posted an article yesterday about why the polls were essentially wrong, but was wrong to pretend weighting mistakes were not a big part of the blunder.

http://www.pewresearch.org/fact-tank/2016/11/09/why-2016-election-polls-missed-their-mark/

Forbes boasted that analysts predicting a Hillary win “used the most advanced aggregating and analytical modeling techniques available”

http://www.forbes.com/sites/startswithabang/2016/11/09/the-science-of-error-how-polling-botched-the-2016-election/#4d6c04257da8

but that is a false claim on its face. What happened was not a “statistical error”, but human error. Weighting for party affiliation or other demographics is risky at best and often leads to unreliable results. To see what I mean, let’s start with the exit poll from the 2012 Presidential Election, by party affiliation, gender, race, and age:

Party Affiliation: Democrats 38%, Republicans 32%, Independents 29%

Gender: Women 53%, Men 47%

Race: White 72%, African American 13%, Hispanic 10%, Asian 3%, Other 2%

Age: 45-64 38%, 30-44 27%, 18-29 19%, 65 & over 16%

http://ropercenter.cornell.edu/polls/us-elections/how-groups-voted/how-groups-voted-2012/

And from 1984 through 2014:

Party Affiliation: Democrats 38.6%, Republicans 32.6%, Independents 27.5%

Gender: Women 53%, Men 47%

Race: White 76%, African American 13%, Hispanic 7%, Asian 2%, Other 1%

Age: 45-64 33%, 30-44 28%, 18-29 14%, 65 & over 25%

http://www.electproject.org/home/voter-turnout/demographics

http://ropercenter.cornell.edu/polls/us-elections/how-groups-voted/

Any poll with demographics different from these numbers is fiddling with the numbers out of clear bias. Without wasting time going through them, this skewing invalidates polls from ABC News, the Wall Street Journal, Fox News, NBC News, CNN, and CBS. If you want to check for yourself, simply find one of their polls and drill down to the demographics, which are usually included at the end of the topline detail.

Weighting is not supposed to produce the “right” answer, but to line information up according to known population demographics. Sadly, a lot of polls screw up the results by trying to sell a message, rather than accurately report the current situation. This is not an attempt to “rig” an election, I believe, but simple human laziness and a habit of using assumptions instead of due diligence.
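To make the mechanics concrete, here is a minimal sketch in Python of weighting a poll to a known party breakdown. The target shares are the 2012 exit poll figures quoted above; the sample counts and the support levels within each group are hypothetical numbers I made up for illustration, and the function is not any pollster’s actual procedure.

```python
# Target shares from the 2012 exit poll quoted above (they sum to 99% due to rounding).
TARGET = {"Democrat": 0.38, "Republican": 0.32, "Independent": 0.29}


def poststratify(sample_counts, support_by_group, target=TARGET):
    """Reweight group-level support so each group counts in proportion to `target`.

    sample_counts: respondents per group in the raw sample (hypothetical).
    support_by_group: fraction of each group backing a candidate (hypothetical).
    Returns (raw_estimate, weighted_estimate) as fractions.
    """
    total = sum(sample_counts.values())
    raw = sum(sample_counts[g] * support_by_group[g] for g in sample_counts) / total
    target_total = sum(target.values())  # normalize, since the shares sum to 0.99
    weighted = sum(target[g] / target_total * support_by_group[g] for g in target)
    return raw, weighted


# Hypothetical poll that over-samples Democrats relative to the exit poll.
counts = {"Democrat": 450, "Republican": 280, "Independent": 270}
support = {"Democrat": 0.90, "Republican": 0.08, "Independent": 0.42}

raw, weighted = poststratify(counts, support)
print(f"raw: {raw:.3f}, weighted: {weighted:.3f}")
```

The same interviews produce a clear majority for the candidate in the raw sample and less than half once the party mix is set to the exit poll shares, which is why the choice of target demographics matters so much.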

This becomes even more salient when you realize that the aggregates used by analysts like Silver and Grim incorporate these biased reports, which invalidates their own analyses. Aggregation is really just group-think, even if some people publish such results with impressive names like “meta-sampling”. Everything that goes into an analysis should be tested for its own veracity, and while this is very difficult for a national report, at the very least you should be candid if you are trusting someone else’s report as a source for your own analysis. Yes, Silver claims he ‘unskews’ polls by other agencies, but that’s kind of like a guy admitting someone spit into your drink but he scooped it out and it’s fine for you to drink. If you know the source is biased, it does not belong in your own work, none of it.

One last thought on polling. The Presidential Election is not a national race, no matter what the media tells you. It’s actually fifty-one different races, whose results are summed up to produce the champion, in this case the President-Elect of the United States. So the polls you ought to have watched are the state polls, especially according to the respective electoral vote value of each state. Most media ignored the state-level polling, and when it was reported it was usually just from a single source that the media found reliable. I will be publishing a report on the accuracy of the state polls for the 2016 Election when I have all the data, but for now it’s important to know the limits of what analysts can even tell you, and keep in mind that most media people are there to sell you entertainment, not facts.
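The “fifty-one races” point can be sketched in a few lines of Python. The electoral vote counts are the real 2016 values for six large states, but the polling margins are hypothetical, and I assume winner-take-all everywhere (Maine and Nebraska actually split their votes by district).

```python
# Electoral votes for a few states under the 2010 census apportionment.
ELECTORAL_VOTES = {
    "California": 55, "Texas": 38, "Florida": 29,
    "New York": 29, "Pennsylvania": 20, "Ohio": 18,
}


def tally(state_margins, electoral_votes=ELECTORAL_VOTES):
    """Sum electoral votes per candidate from state-level margins.

    state_margins: candidate A's lead over candidate B in points
    (positive means A leads). The values used below are hypothetical.
    Assumes winner-take-all; a margin of exactly zero is counted for B,
    which is a simplification.
    """
    totals = {"A": 0, "B": 0}
    for state, margin in state_margins.items():
        winner = "A" if margin > 0 else "B"
        totals[winner] += electoral_votes[state]
    return totals


# Hypothetical margins: A runs up large leads in two big states
# but trails narrowly or modestly in the other four.
margins = {
    "California": 25.0, "New York": 20.0, "Texas": -9.0,
    "Florida": -1.0, "Pennsylvania": -1.0, "Ohio": -8.0,
}

print(tally(margins))
```

Candidate A’s lopsided leads in two states do nothing for the electoral count beyond those two states, so a healthy national number can sit alongside a losing map.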

Thursday, November 10, 2016
