Friday, October 31, 2008

Things That Make Polls Go D’Oh

It should be obvious by now that I will never get a job offer from Gallup, Rasmussen, or Survey USA. I’ve been pretty hard on them regarding the way they’ve weighted their party affiliation demographics, and I have repeatedly pointed out that ALL of the major polls are failing to comply with NCPP standards for disclosure and transparent practices. Frankly, I once held polling groups in much higher respect than I do right now. And besides reporting what the invalid polls mean for this election, I also feel compelled to warn readers that opinion polling in general has lost its ethical core. I hope it will return to its commitment to accuracy and honest reporting, but for now polling seems to have gone the way of responsible mainstream journalism.

Liberal critics of my articles, and those who still trust the polling groups because of past work which was accurate and appeared trustworthy, have asked a very legitimate question: What if I am wrong? Isn’t it possible that I just cannot accept that Obama is going to win this election, and I am grasping at straws for moral support? I would consider answering that they could be right and I could be wrong, but even then I’d have to start by asking for clarification on exactly what they mean to ask.

Do they mean the Associated Press/Gfk poll which says Obama will win by one, or the Pew Research poll which says Obama will win by fourteen?

Do they mean the Battleground poll which says Obama will win by three, or the CBS/NYT poll which says Obama will win by thirteen?

You get the idea. The polls simply do not agree with each other. And yes, those margins are significant evidence of invalidity. Earlier this week I read a blog by a professor who assumes that, since all the polls say Obama is going to win, they really do agree with each other and the margins do not matter. He contends that the polls which show a close race are really just the low end of the range, the wide lead polls are the upper end, and the average is really how things are going now. These assumptions, however, are invalid because the confidence level tests show the polls do not agree closely enough to avoid evidence of collinearity, and if collinearity exists then the results of the polls cannot be accepted, regardless of whether they appear believable or not.

Also, each poll has its own margin of error, usually around three percent, which is to say that Obama and McCain could each be as much as three points lower or greater in support than the poll shows. As a result, any poll which shows less than a six point lead for Obama is, statistically, saying that McCain could possibly be winning. Whether or not McCain is shown to be in the lead is not statistically relevant, except that we can say the polls do not indicate a McCain lead outside the MOE. However, even then we have to be careful to note that because of the invalid range of poll results, no valid conclusions can be made at all. None.
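To make that arithmetic concrete, here is a minimal sketch in Python. It follows the simple reading used in this post, where each candidate's number can be off by the full margin of error in opposite directions, so the lead can move by twice the MOE. It is not a formal significance test. The three-point MOE is the "usually around three percent" figure from the text rather than any pollster's published value, the function names are made up for illustration, and the leads are the four quoted earlier.

```python
# Sketch only: a 3-point margin of error is assumed for every poll,
# taken from the "usually around three percent" figure in the text.
MOE = 3.0

# Leads for Obama, in points, as quoted earlier in this post.
leads = {
    "AP/GfK": 1,
    "Battleground": 3,
    "CBS/NYT": 13,
    "Pew Research": 14,
}

def lead_range(lead, moe=MOE):
    """Range of the lead if each candidate is off by up to `moe` points."""
    return (lead - 2 * moe, lead + 2 * moe)

def mccain_could_lead(lead, moe=MOE):
    """True when the low end of the lead's range falls below zero."""
    return lead_range(lead, moe)[0] < 0

def ranges_overlap(a, b):
    """True when two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

if __name__ == "__main__":
    for name, lead in leads.items():
        low, high = lead_range(lead)
        print(f"{name}: lead {lead}, range {low:+.0f} to {high:+.0f}, "
              f"McCain could lead: {mccain_could_lead(lead)}")
    print("AP/GfK and Pew ranges overlap:",
          ranges_overlap(lead_range(leads["AP/GfK"]),
                         lead_range(leads["Pew Research"])))
```

Under these assumptions the AP/GfK lead could sit anywhere from minus five to plus seven, while the Pew lead could sit anywhere from plus eight to plus twenty, so the two ranges never touch.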

We also need to observe what’s been going on with the poll trends. In the last ten days, for example, Rasmussen has shown swings of up to 5 points, or a half-point per day. He’s saying that more than a half-million people on average are changing their minds every day. Does this sound reasonable to you?

The latest Fox poll shows McCain closing six points in just a week. That’s 7.8 million voters changing their minds in that time. Has McCain’s campaign done anything different that would explain that shift to you? And if not, why is the poll changing so drastically now that the race is coming to an end?
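The voter counts in the last two paragraphs come from a simple conversion, sketched here in Python. It assumes an electorate of about 130 million voters, which is the figure implied by equating six points with 7.8 million voters, and it treats each point of swing as one percent of voters, as this post does. The constant and function names are for illustration only.

```python
# Assumed electorate size, implied by the post's own arithmetic:
# 7.8 million voters / 6 points = 1.3 million voters per point.
ELECTORATE = 130_000_000

def voters_for_points(points, electorate=ELECTORATE):
    """Number of voters represented by a shift of `points` percentage points."""
    return round(electorate * points / 100)

def daily_shift(points, days, electorate=ELECTORATE):
    """Average number of voters changing their minds per day over `days` days."""
    return round(voters_for_points(points, electorate) / days)

if __name__ == "__main__":
    print("Six-point swing:", voters_for_points(6), "voters")
    print("Five points over ten days:", daily_shift(5, 10), "voters per day")
```

On that assumption, a five-point swing over ten days works out to 650,000 voters a day, which is where "more than a half-million" comes from.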

Gallup is still admitting they are clueless, as they continue to publish three separate models of voter opinion. You really should ask yourself, if Gallup was on top of things this year, why did they trash the original model in favor of one using unprecedented demographic assumptions, then use that same data to backtrack and try to reflect a “traditional” model? What did they see that made it clear they were wrong? And having been wrong not once but twice in fundamental operations this year, why should you assume they got lucky on the third guess, which in any case is built on the same methodological decisions they have tacitly admitted were wrong before?

The first question the NCPP says any journalist should ask about a poll is who is paying for it. With that in mind, shouldn’t you be skeptical that the polls reporting the largest leads for Obama are sponsored by agencies known to be pro-Obama and anti-McCain, specifically CBS News, the New York Times, ABC News, the Washington Post, and Newsweek? And shouldn’t you wonder if the community of pollsters just might be letting itself be influenced by Obama’s big-dollar media machine? Half a billion dollars of media publicity is bound to have an effect, and why wouldn’t it affect people who run the polling groups? People like Zogby, who called the 2004 election for Kerry months before the actual voting? People like Scott Rasmussen, who is getting serious coin to sell the story of this election by subscription? One area where I can tell you I am clearly more worthy of your trust is that no one is paying me anything for what I do on the blogs. Not a penny. So, while I’d like to be rich someday, it doesn’t look like I’m going to get there by blogging on polls, but that means that you will be getting my honest opinion, based on my reasoning and the evidence, not on what effect it will have on my bank account. Sorry, but a pollster who refuses to show internal data to the public is a mercenary, not a professional, and a pollster who lets any media outfit decide what questions will be asked, what order they will be in, and which respondents are appropriate and how/when they will be contacted, is a media whore and his analysis is inherently dishonest.

OK, that’s pretty harsh, and I want to emphasize that many polls are indeed trying to be professional and accurate, as much as the business will let them be. And even in the media whore groups, there are individuals who are honest and honorable (and probably miserable) and trying to put out a solid product. The problem comes from two directions. First, polling has become a business more than a profession, meaning that the guys directing the polls have become too willing to sell a story, even if that story is not exactly true. This becomes apparent when polls report shifts which are not caused by valid events, most easily seen in the phenomenon of convention ‘bounces’. It’s one thing to expect a party’s base to become energized when the nominee is finally known and he comes out formally in a way that shows confidence and capability, but in recent years the pollsters have also decided this somehow affects the opposing party’s support levels, a patently absurd notion on its face. I mean, what did Obama do at his convention that is supposed to have won over some Republicans, and just why should we believe that a number of Democrats, even briefly, supported McCain because he chose Sarah Palin for his running mate? That’s manipulation of the data, folks, and cannot be explained any other way. It’s been going on for a while, that roller-coasting of the numbers, since polls in the media need to keep attention, and to do that they need to be exciting, even if it means being dishonest. They get away with it because they have a lot of time to worry about closing in on accuracy in the late weeks. Of course, some years they blow that, too. It needs to be said, repeated and repeated again, that polls blow the call by more than their published margin of error about 40% of the time.

The other problem is the Obama Machine. There are a lot of unprecedented conditions in this election, and I do not think the polling groups ever really sat down and thought about what the new conditions would be. Well, actually they did, but they did not test their conclusions, and as a result bought into some pretty tall tales from the Obama people. The polls assumed the following things would be very different this year:

1. Barack Obama being the first black to receive a major party nomination for President, black voters would be greatly motivated to register and vote, and this would swing decisively towards Obama. This led some polls to over-sample black voters, in the expectation that their influence would be more significant this year.

It’s true and false. Black voters have indeed become more motivated this year, but as a demographic group blacks have always been enthusiastic, and have always overwhelmingly supported the democratic nominee in presidential elections. As a result, it is mathematically impossible for black voters to significantly change the outcome of the election by supporting Obama. In a tight race, the increased participation could make the difference in some states, but nationally the effect is minimal and polling models should not be changed because of it.

2. Barack Obama would greatly inspire and motivate young voters to register and vote, and this demographic would swing decisively towards Obama. This led some polls to over-sample young voters and to count more newly-registered voters as likely voters.

This one has been difficult to prove, since only the actual election can confirm or disprove the theory. However, John Kerry saw a strong rise in democratic party registrations in 2004, in part due to the primary efforts of Howard Dean. This created an apparently significant advantage for the fall campaign, which was one of the reasons that Zogby called the election for Kerry early in the summer. In the actual election, however, under-30 voters’ proportion of the vote did not change from the 2000 election, and many of the newly registered voters simply did not vote, which is also consistent with historical behavior. Accordingly, it is not reasonable to alter polling models to behave in a manner inconsistent with historical norms.

3. The combination of excitement over Obama’s campaign, coupled with the nation’s dissatisfaction with President Bush and the economy, would lead to a great increase in democrats’ participation relative to republicans, as more people would see themselves as democrats and republicans would be likely to stay home. This led almost all polls to report results which either left democrat-heavy respondent pools unweighted, or which weighted polls to reflect heavy democrat advantages.

As with rumor 2, this cannot really be confirmed or disproven until the election is finished. However, history indicates the rumor is unfounded. In 1976, the republicans were expected to be dispirited, Richard Nixon having resigned in disgrace just two years previously. This was one reason that just after the party conventions, Governor Carter of Georgia led Ford by 33 points, a seemingly undeniable blowout. Yet in the actual election, Carter won by only two percentage points, and some political experts believe that if the election had been held a week to ten days later, Ford would have won. Part of the reason was that republicans in 1976 did show up to vote, in smaller numbers than the democrats but in far greater numbers than pollsters had expected. The same thing happened in 1948, when democrats were supposed to have given up, yet the record shows something far different. If a poll’s model is based on known history rather than pure speculation, then that model should not deviate from historical norms.

In my opinion, the polling groups allowed themselves to believe unfounded myths in all three of the cases I just mentioned. But they also failed to consider the influence of the half-billion dollars being spent by the Obama campaign, the rock-star behavior of his cadre (and a comparable level of professional knowledge and interest in middle America) in influencing and intimidating the media and public image (‘vote for Obama or you’re a racist’), and the heavily-urbanized character of his campaign and publicity efforts. The polling groups failed to note the dichotomy between the tone of Obama’s early primary victories and the voter response as the campaign wore on, and they failed to adjust their weighting to reflect actual results from primary elections and to track with historical norms in each state and nationally. A massive effort by the Obama campaign to cast this election as unprecedented resulted in every major polling group abandoning historical models to create unproven models based on assumptions. What we are seeing now is the result of these models failing as key assumptions fail.

6 comments:

Anonymous said...

Drudge reports that McCain has a one point advantage over Obama -- 48 to 47 -- in a one day poll taken today in his tracking poll. This news challenges the mythology of the Obama "invincible" juggernaut, and may bring out enthusiastic McCain voters in droves. I realize Zogby's limitations, but the psychological impact of this could be enormous!

Anonymous said...

I was polled on the Presidential election by PEW two nights ago. They started by offering me $10 to participate. It struck me that they must be wildly oversampling the kind of people who would be lured by $10 to waste 10 minutes of their evening - e.g. students, the poor. Most people in my demographic (upper middle class married, kids, busy with work activities etc) would NOT waste their time with their family for ten bucks unless they have a strong interest in politics (like me).

Anonymous said...

Something really stinks with today's Gallup poll (Sat, Nov 1). Both the Traditional and Expanded likely voter models are even. What is more, while they interviewed 2847 registered voters, they say they ended up with 2516 interviews that qualified as traditional likely voters, and only 2480 that qualified as expanded likely voters! That means there are more people who say they are going to vote and who have voted in the past than there are people who say they are going to vote. Seems counterintuitive to me. Makes me wonder if they computed their percentages using the wrong numbers.

Meschatons said...

McCain Landslide - The Puma Factor
http://www.marstonchronicles.info/index.php?option=com_content&task=view&id=94&Itemid=118

TOTWTYTR said...

If the professor you mentioned is the one who writes "Back Talk", I hope you are right and he is wrong. He was exactly right in 2006 about the Republicans losing both the House and the Senate.

That being said, in my own unscientific way I've observed what you have. Here in ultra liberal Massachusetts I've seen a much higher number of McCain/Palin signs and bumper stickers than I would expect. Which doesn't mean that they will even come close in this state, but if there is any enthusiasm for them in this state, it might translate into a lot of enthusiasm in other states.

If you believe, as I do, that the Main Stream Media is pushing for an Obama victory, then it's not a stretch to believe that the polls that they pay for are skewed to show Obama winning big. It's an attempt to suppress McCain/Palin supporters voting. At least it feels that way.

Anonymous said...

I appreciate and heartily agree with your comments. The wild and crazy predictions of the media-sponsored polls -- e.g. CBS/NYTimes -- are not surprising to me. I see them more as headline generators and propaganda tools rather than as serious polls. But I have always assumed that the "real" pollsters --- Gallup, Rasmussen, Zogby and a handful of others -- would toe the line and play it straight, if for no other reason than their reputations and their livelihoods depend on their accuracy. I suppose they may still show a late surge for McCain to tighten the race. Or if they call for an Obama nationwide win of six points, and McCain wins by three, they can always cover themselves by saying the voters changed their minds at the last minute, or they could point to the "Bradley Effect". But then I ask, what good are pollsters at all?