Saturday, November 12, 2016

The Inexcusable Arrogance of The Pundits

Tuesday, Donald Trump defeated Hillary Clinton to become President-elect of the United States.  Trump celebrated the win late that night and Ms. Clinton conceded early Wednesday morning, but as the week ended the major pundits were largely unwilling to admit that they were wrong.  Excuses for blowing the call ranged from blaming the inaccuracy on late voter decisions to complex explanations that – statistically – the pundits weren’t that far off.

For example, Nate Silver (who boasted for four years about how well he did in predicting state and national results in 2012) presented a weak defense of his statistical model.

Silver also claimed that the results were within the standard margin-of-error, implying that he didn’t really get it wrong.

Silver gave Trump a 29% chance of winning early Tuesday night.  It’s important to keep in mind that Silver also limited Trump’s chances of winning to 12.6% back on October 18, and that Silver’s forecast fluctuated as polls did; Silver locked his forecast into poll accuracy, even though he claimed to adjust for bias and outliers – he bluntly failed to consider the effect of groupthink.

Next up is the Huffington Post, which boldly predicted a 98% chance of a Clinton win (leaving Trump only a 2% chance), then blamed the loss on a “black swan event”, which amounts to claiming no one could have seen it coming.  This would be a lie.

The New York Times gave Clinton an 85% chance of winning the day of the election, down a bit from 93% on October 25.   This equated to giving Trump a 15% chance, up from 7% on the respective dates.

Rather than candidly admit their bias and its results, the NYT actually blamed … the data itself.   Hypocrisy in print, folks.

Larry Sabato has made a nice living from predicting elections over the years, actually claiming a 99% success rate in 2004 and 97% in 2012.

Sabato called 347 Electoral Votes for Clinton this year, which cannot be sanely called anything but a faceplant.

Forbes, best-known for business reporting, also got into the election forecast game, and when they got it badly wrong they blamed ‘statistical error’.

And so it goes.  At this writing, exactly none of the people who made money and gained fame from predicting elections has had the guts to plainly admit they got this one completely wrong.

Why should we care?  Because a lot of media paid attention to these pundits all through the election, especially at the end.  They threw out predictions that were clearly way off the mark.  A lot of them have offered excuses, but let’s step back and see why the explanations are worthless.

Silver, for example, goes into great detail about different factors and how they influenced the election results. 

Some of that is interesting reading, but the sum effect is that it comes off as butt-covering, not least because any professional should have properly included such factors in their pre-election forecast.

So what should the forecast have looked like?  To answer that, we need to step back and ask what we expect from a forecast.  A forecast should have general similarity to what actually happens.  For example, in a weather forecast we often hear about, say, a ‘30% chance of rain’.  That’s actually a little vague, since it doesn’t tell us where that rain will happen or when, but if we hear 30%, we would expect some clouds and only in some places.  A completely clear, sunny day or a torrential downpour would mean the forecast was wrong, no matter what explanation the weather guy offered. So the election results can be seen this way:

In a straight look at the Popular Vote, Hillary Clinton claimed 47.8% to Trump’s 47.3%.   Of course, the actual election does not depend on the Popular Vote, but this result is consistent with a national picture, and the main point is that none of the major pundits gave Trump a 47.3% chance.  By this metric, the major polls grade out this way in their calls:

FOX News: Called 44% for Trump (-3.3%), called 48% for Clinton (+0.2%), aggregate (-3.5%)
LA Times: Called 47% for Trump (-0.3%), called 44% for Clinton (-3.8%), aggregate (-4.1%)
ABC/WaPo: Called 43% for Trump (-4.3%), called 47% for Clinton (-0.8%), aggregate (-5.1%)
IBD/TIPP: Called 45% for Trump (-2.3%), called 43% for Clinton (-4.8%), aggregate (-7.1%)
CBS News: Called 41% for Trump (-6.3%), called 45% for Clinton (-2.8%), aggregate (-9.1%)
Bloomberg: Called 41% for Trump (-6.3%), called 44% for Clinton (-3.8%), aggregate (-10.1%)
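For anyone who wants to check the grading, here is a minimal Python sketch of the arithmetic behind that list. It assumes the "aggregate" figure is simply the two candidates' misses added together without regard to sign, which is what the numbers above work out to; the function names are my own, and the sketch reports the combined miss as a plain magnitude rather than with a minus sign.

```python
# Final popular-vote shares quoted in this post (percent).
ACTUAL = {"Trump": 47.3, "Clinton": 47.8}

# Final poll calls as listed above (percent).
POLLS = {
    "FOX News": {"Trump": 44, "Clinton": 48},
    "LA Times": {"Trump": 47, "Clinton": 44},
    "ABC/WaPo": {"Trump": 43, "Clinton": 47},
    "IBD/TIPP": {"Trump": 45, "Clinton": 43},
    "CBS News": {"Trump": 41, "Clinton": 45},
    "Bloomberg": {"Trump": 41, "Clinton": 44},
}


def poll_misses(call, actual=ACTUAL):
    """Return (signed miss per candidate, combined absolute miss), in points."""
    misses = {name: round(call[name] - actual[name], 1) for name in actual}
    combined = round(sum(abs(m) for m in misses.values()), 1)
    return misses, combined


def rank_polls(polls=POLLS):
    """List (poll, misses, combined) from smallest to largest combined miss."""
    scored = [(name, *poll_misses(call)) for name, call in polls.items()]
    return sorted(scored, key=lambda row: row[2])


if __name__ == "__main__":
    for name, misses, combined in rank_polls():
        print(
            f"{name}: Trump {misses['Trump']:+.1f}, "
            f"Clinton {misses['Clinton']:+.1f}, combined {combined:.1f}"
        )
```

Ranking by the combined miss keeps a poll from looking good just because it landed close on one candidate while badly missing the other.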

Pretty much everybody was outside a statistical margin of error (Fox was almost inside that line). No one can claim to have nailed that call, but each poll got close-ish on at least one candidate.  Grade them C’s and D’s at a professional standard.

But Presidential elections depend on winning electoral votes from state contests.  In the end, Trump won 306 electoral votes to Clinton’s 232 electoral votes, or 56.9% of the EV to 43.1%.  No one at all came close to predicting Trump would win nearly 57 percent of the EV.  Absolutely none of the pundits listed above were anywhere close to being right.  If these were students, we’d be comparing different levels of ‘F’ grades on an exam.

Again using Real Clear Politics’ published results, we can see the average results of each state by vote for each candidate; the average should give us a reasonable forecast for a candidate winning the election.  Using the vote results by state, Trump claimed an average 48.9% of the vote to Clinton’s 45.2%.  Again, none of the pundits came close to this result.

Pundits will sometimes point to variables, margin of error, and other technicalities to excuse blowing the call. But never forget that the main reason for any forecast is to give you a reasonable expectation of what is coming.  It’s fair (but very rare) for a statistician to admit that he cannot forecast a clear outcome; pay attention here to the fact that both Gallup and Pew refused to publish election predictions this year.  But if a pundit publishes a forecast that projects a clear winner by a wide margin, as Silver, Huffington, the New York Times, Sabato and so on all did, they cannot pretend that they did anything but fail when results are so plainly different from their predictions.  Aggregation is a poor tool in election forecasting, and sooner or later the public should demand better work from people who are happy to take credit and publicity for their projections.

Man up, you wimps.  You blew it.         

Thursday, November 10, 2016

Pollsters Ignored Their “Check Assumptions” Lights

Back in 2000 and again in 2004, I enjoyed a small piece of influence through political opinion poll analysis.  Statistics is an intriguing science, all the more because it tries to quantify and predict human behavior.  But that same human behavior also skews how people think, including analysts, and in 2008 and 2012 it caused me to miss important trends in American politics.  I was embarrassingly wrong in predicting the Presidential elections, especially missing the energy of Obama’s 2008 run.  So I backed off, paid more attention to my regular job and family, and paid less attention to statistics.  Other poll mavens enjoyed the attention, especially Nate Silver, who turned his statistical devotion to baseball into political prominence with Obama’s success.  But Silver made the same mistake I did, and in his case the embarrassment is greater because as a professional statistician, he really ought to have known better.  Silver let his enthusiasm for Democrat opinion cause him to ignore warning signs until it was too late to avoid a face plant.

Let’s have a quick review of how polls saw the 2016 Presidential Election, and also how polls work, and finally how predictive analysis is created. 

Hillary Clinton announced her decision to run for the White House on April 12, 2015.  This is important because Clinton already enjoyed significant name recognition and with the roles of First Lady, Senator and Secretary of State on her resume, she would start as an obvious front-runner for the Democrats’ nomination.  Nate Silver gave her a 59.9% chance of winning the party nomination at the beginning (I’m using Silver here for two reasons – first, his projections are built from aggregates of major national polls, and second, Silver was the most prominent poll analyst quoted in the media).  She enjoyed media support through the end of 2015 as the presumptive front-runner, but by the end of October 2015 Clinton’s lead over Sanders in Silver’s chart was down to 46.8% to 26.1%, notable not for Sanders’ strength but Hillary’s weakness.  By February 2016, Silver put the race at 49.6% Clinton to 39.1% Sanders – note that Hillary’s campaign was failing to win over most of the undecideds, losing them to Sanders more than four to one.  By April 23, 2016 Silver had the race 49.6% Clinton to 41.5% Sanders; note two important factors apparent, first that Hillary appeared to have a lead bigger than Sanders could close, but second that Sanders had more momentum than Clinton, and had enjoyed higher energy for some months.  By the end of June, Silver showed the race 55.4% Clinton to 36.5% Sanders, essentially a done deal for the Democratic Party nomination.

Donald Trump announced his candidacy for the office of the President on June 16, 2015.  At that time Silver counted his support at a 3.6% chance of winning the GOP nomination.  Let’s stop there and consider that this meant the polls showed Hillary Clinton’s chances of winning her party’s nomination were more than sixteen times greater than Donald Trump’s chances of winning his party’s nomination.  Part of this was due to the large number of candidates for the Republican nod, but also Donald Trump – while known as a face and name – was unknown as a political contender, so he had to establish his bona fides with both the GOP and the voters.  Trump’s campaign quickly gained support, however, as he passed the 20% threshold on July 26, 2015, and the 40% threshold on March 21, 2016.  This means that Donald Trump had not won over most voters until after his Super Tuesday wins in Alabama, Arkansas, Georgia, Massachusetts, Tennessee, Virginia and Vermont.  On March 22, Trump claimed another 58 delegates by winning the Arizona primary.  By the end of May, Trump had essentially locked up the GOP nomination.

Both Clinton and Trump finished the win-the-nomination part of their campaigns with damage, however.  Trump’s problems were obvious – to energize his base, Trump attacked establishment Republicans and demographics aligned with opponents of populist theory, and this cost him nationally in polls. In early June, polls showed Trump’s support at 38.1%, compared to 42.1% for Clinton.  But Clinton had obvious problems, too.  The way Clinton won the Democrats’ nomination left many Sanders supporters convinced the primary had been rigged, which may be one reason Trump made similar claims as the General Election reached its resolution.  But also, given the many demographic groups Trump had – allegedly – attacked, a four-point lead for Clinton was a clear warning sign that something was not as described. 

Call it a poll version of that annoying “check engine” light on your dashboard.  Until you have someone get under the hood, you don’t know what exactly has gone wrong, but you can’t ignore it unless you don’t mind spending hours on the side of the road beside your smoking vehicle, at the mercy of passing traffic.  There is science behind a poll that is put together and analyzed properly, but laziness or assumptions in your data or procedures can invalidate your conclusions, and make you look a fool in public. 

By the way, Nate Silver uses an aggregate of polls, but he is also guilty of some subjectivity in his source selection.  For example, Silver’s aggregate shows Clinton had a wire-to-wire lead over Trump in polling, with Trump never enjoying a lead in the aggregate polling at any time.

Real Clear Politics, however, which also uses an aggregate of polls, showed Donald Trump with an aggregate lead on May 24 and from July 25 through July 28 of this year.

That’s not to say one aggregate is ‘better’ than the other, but to illustrate the fact that any aggregate is subjective and contains implicit bias. Ironically, Silver was aware of this bias and tried to correct for it – he calls this “trend line adjustment” – but in the end Silver’s own bias still influenced his conclusions.

It’s important to remember that Silver was wrong about Trump winning the GOP nomination.  After Trump won the GOP nomination, Silver admitted “we basically got the Republican race wrong.”

There was no evidence that Silver went back to find the evidence he overlooked in his initial analyses, which could have corrected his results in the General Campaign.  But there is, at least, evidence that Silver knew something in the numbers was wrong.  Just before the final day of the election, Silver put out his “final election update”, giving Clinton a 71% chance of winning.

This ran contrary to far more aggressive posts from the New York Times, which gave Clinton an 82% probability of winning; the Princeton Election Consortium, which gave Clinton a 93% chance to win the White House; left-leaning pundit Larry Sabato, who did not offer a probability but called for Clinton to win 347 Electoral Votes; and of course the Huffington Post, which posted that Clinton had a 98% chance to win the Oval Office.

Anyone who turned on ABC, NBC, CBS, CNN, or Fox was also flooded with assurances that Clinton was poised to win by large margins.   That all of these analysts were wrong, and to such a large degree, is amusing given their hubris, but concerning given their prominence in media coverage of the election.

The last week of the election, Nate Silver’s concerns about the polling data caused him to scale back his probability for Clinton (he initially had Clinton at 89%, but as the election approached he walked it back to 71%), while Ryan Grim of the Huffington Post kept Clinton at a 98% chance to win. This led to some ill-advised words on Twitter between the two men about each other’s methodology.

Ironically, while Silver was correct that weighting Clinton’s advantage beyond anything supported by poll data was foolish, he failed to properly test the underlying assumptions installed in his own model.

I found it intriguing to notice that neither Gallup nor Pew published polls for the Presidential election, each focusing instead on issues rather than candidates.  A business reason was provided, but given the long history and prominence Gallup and Pew enjoyed in polling Presidential races, the reason given rings false.  A more likely explanation is the difficulty in addressing behavior changes in the voting public.  In addition to the shift from landline phones to cell phones, voters are more likely to discuss opinions online than in a phone interview, but there is no statistically sound means to randomly contact respondents online, and the results of online polls are as varied as the opinions reported by them.  Pew observed that online polls are “non-probability” polls, which eliminates by definition the random nature of polls, and therefore calls into question any political conclusion presented by such a poll.

Pew also posted an article yesterday about why the polls were essentially wrong, but was wrong to pretend weighting mistakes were not a big part of the blunder.

Forbes boasted that analysts predicting a Hillary win “used the most advanced aggregating and analytical modeling techniques available”, but that is a false claim on its face.  What happened was not a “statistical error”, but human error.  Weighting for party affiliation or other demographics is risky at best and often leads to unreliable results.  To see what I mean, let’s start with the exit poll from the 2012 Presidential Election, by party affiliation, gender, race, and age:

Party Affiliation: Democrats 38%, Republicans 32%, Independents 29%

Gender: Women 53%, Men 47%

Race: White 72%, African American 13%, Hispanic 10%, Asian 3%, Other 2%

Age: 45-64 38%, 30-44 27%, 18-29 19%, 65 & over 16%

And from 1984 through 2014:

Party Affiliation: Democrats 38.6%, Republicans 32.6%, Independents 27.5%

Gender: Women 53%, Men 47%

Race: White 76%, African American 13%, Hispanic 7%, Asian 2%, Other 1%

Age: 45-64 33%, 30-44 28%, 18-29 14%, 65 & over 25%

Any poll with demographics different from these numbers is fiddling with the numbers out of clear bias.  Without wasting time going through them, this skewing invalidates polls from ABC News, the Wall Street Journal, Fox News, NBC News, CNN, and CBS.  If you want to check for yourself, simply find one of their polls and drill down to the demographics, which are usually included at the end of the topline detail.

Weighting is not supposed to produce the “right” answer, but to line information up according to known population demographics.  Sadly, a lot of polls screw up the results by trying to sell a message, rather than accurately report the current situation.  This is not an attempt to “rig” an election, I believe, but simple human laziness and a habit of using assumptions instead of due diligence.
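To make the mechanics concrete, here is a rough Python sketch of weighting a sample to a party-affiliation target. The target shares are the 2012 exit poll figures quoted above; the raw sample (respondent counts and support for a generic "candidate X") is entirely hypothetical and made up for illustration, as are the function names. It is not any pollster's actual procedure, just the basic idea that each group's weight is its target share divided by its share of the sample.

```python
# Party shares from the 2012 exit poll quoted above (percent).
TARGET = {"Democrat": 38, "Republican": 32, "Independent": 29}

# HYPOTHETICAL raw sample: respondents per party and the fraction of each
# group backing a generic candidate X. These numbers are invented.
SAMPLE = {
    "Democrat": {"n": 450, "support_x": 0.90},
    "Republican": {"n": 250, "support_x": 0.08},
    "Independent": {"n": 300, "support_x": 0.45},
}


def weights(sample, target):
    """Weight per group = target share / sample share."""
    total_n = sum(group["n"] for group in sample.values())
    total_target = sum(target.values())  # quoted shares sum to 99, so normalize
    return {
        party: (target[party] / total_target) / (sample[party]["n"] / total_n)
        for party in sample
    }


def estimate(sample, w=None):
    """Overall support for candidate X, optionally weighted by group."""
    def weight(party):
        return w[party] if w else 1.0

    supporters = sum(g["n"] * weight(p) * g["support_x"] for p, g in sample.items())
    respondents = sum(g["n"] * weight(p) for p, g in sample.items())
    return supporters / respondents


if __name__ == "__main__":
    print(f"unweighted: {estimate(SAMPLE):.1%}")
    print(f"weighted:   {estimate(SAMPLE, weights(SAMPLE, TARGET)):.1%}")
```

With these made-up inputs the same set of interviews moves by several points once the party mix is changed, which is why the choice of target mix matters so much.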

This becomes ever more salient when you realize that the aggregates used by analysts like Silver and Grim incorporate these biased reports, which invalidates their own analyses.  Aggregation is really just group-think, even if some people publish such results with impressive names like “meta-sampling”.  Everything that goes into an analysis should be tested for its own veracity, and while this is very difficult for a national report, at the very least you should be candid if you are trusting someone else’s report as a source for your own analysis.  Yes, Silver claims he ‘unskews’ polls by other agencies, but that’s kind of like a guy admitting someone spit into your drink but he scooped it out and it’s fine for you to drink.  If you know the source is biased, it does not belong in your own work, none of it.

One last thought on polling.  The Presidential Election is not a national race, no matter what the media tells you.  It’s actually fifty-one different races, whose results are summed up to produce the champion, in this case the President-Elect of the United States.  So the polls you ought to have watched are the state polls, especially according to the respective electoral vote value of each state.  Most media ignored the state-level polling, and when it was reported it was usually just from a single source that the media found reliable.  I will be publishing a report on the accuracy of the state polls for the 2016 Election when I have all the data, but for now it’s important to know the limits of what analysts even can tell you, and keep in mind that most media people are there to sell you entertainment, not facts.
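As a rough illustration of the "separate races, summed up" point, here is a small Python sketch. The four contests, their electoral vote values, and the vote counts are all hypothetical, and it assumes every contest is winner-take-all (a simplification, since Maine and Nebraska split their electoral votes by district). Ties are not handled.

```python
from collections import Counter

# HYPOTHETICAL contests: (name, electoral votes, {candidate: popular votes}).
CONTESTS = [
    ("State A", 20, {"X": 2_900_000, "Y": 2_950_000}),
    ("State B", 10, {"X": 1_400_000, "Y": 1_420_000}),
    ("State C", 25, {"X": 4_500_000, "Y": 3_000_000}),
    ("State D", 6, {"X": 600_000, "Y": 650_000}),
]


def tally(contests):
    """Return (electoral votes, popular votes) per candidate.

    Each contest awards all of its electoral votes to its plurality winner.
    """
    electoral = Counter()
    popular = Counter()
    for _name, electoral_votes, votes in contests:
        winner = max(votes, key=votes.get)
        electoral[winner] += electoral_votes
        popular.update(votes)
    return electoral, popular


if __name__ == "__main__":
    electoral, popular = tally(CONTESTS)
    print("electoral:", dict(electoral))
    print("popular:  ", dict(popular))
```

In this invented example candidate X piles up the most votes overall but loses on electoral votes, because narrow wins in three contests count for more than one lopsided win.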

Sunday, November 06, 2016

A Plea For Americans to Be Americans

The TV show “60 Minutes” ran an interesting but depressing feature this week, discussing how it has become impossible for people with different political views to reasonably discuss the election.  Whether Democrat or Republican, Progressive, Conservative, Libertarian or something else, whether men or women, whether old or young, whether urban or suburban or rural, from any place in the nation, no matter what race, no matter what education or environment, everyone is upset with the content and tone of the election.  What is especially alarming is that people almost universally talk over each other, refuse to listen to different opinions, and quickly resort to personal insults and attacks.

This is not just about Donald Trump or Hillary Clinton.  Some of this, I fear, goes back much further, to a point I won’t identify because when it started is irrelevant except that it’s real, and it threatens all we hold good in our nation.

Look, in 2008 and 2012 Barack Obama won election and re-election, but more than forty percent of voters wanted the other guy.  In fact, not since Nixon won re-election in 1972 have voters given sixty percent or more to a Presidential candidate (and we have since learned a lot about how Nixon got his numbers, enough to call the results into question).  But the new President pretty much always acts as if he won total support by right, to the point that the losing party is treated as if they don’t represent many millions of Americans.  Both Trump and Clinton have made it clear they intend to do just that, should they win.  In addition to everything else we have suffered as a nation this election season, Americans now feel as if the stakes are poor-or-nothing: they put up with an unethical candidate they back, simply because the alternative is abominable.

Later this week, we’ll know the results of the election (maybe even by end of night, Tuesday).  But the fate of the nation depends on us, whether we hold the new President accountable, and whether we hold ourselves accountable to the ideals put in place for us by so many citizens before us.

Win or lose, let us be Americans, in spirit and practice.