Wednesday, October 03, 2012

Polls, Conspiracies, Common Sense, and Arguments

Imagine I took a poll of delegates at the 2012 GOP National Convention. Let’s say I made sure that gender, race, age, education, and other demographics apart from party affiliation were carefully matched to voter norms. Would you be happy with such a poll? If you are a Republican, maybe, but obviously not if you are a Democrat. A lot of folks would be quick to point out that Mitt Romney would pretty much crush Barack Obama in such a poll. The flip side is also obvious: if I polled only delegates at the Democrats’ convention, the results again would be pretty predictable, and by any reasonable standard invalid as a predictor of what will happen in the actual election. Yet a lot of polls are out with very odd political weighting, and it’s laughable to hear the excuses made that these polls should be considered accurate with such imbalances.

According to Jay Cost, there is a disparity between polls, which Cost calls a “bimodal distribution” and which he says results in a difference in poll results that “looks to be built around how many Democrats are included in the polling samples.” I both agree and disagree with Mr. Cost. I do agree that the polls are off-kilter, but I reject the notion of conspiracies on that count. Polls depend on accuracy for their reputation, so either the pollsters really believe they have it right, or a whole bunch of them will be making major changes between now and Election Day.

Before I go into those polls, I should be clear that I believe Romney will win, for the following general reasons. First, the economy and overall national condition are such that I do not believe Democrats will have the kind of numbers they enjoyed in 2008. In fact, a USA Today article from last December warned that both Democrats and Republicans had suffered declines while Independents grew, but noted that Democrats’ losses were significantly more than twice Republican losses. Democrats might still hold an edge in affiliation, but it would be in decline, not growing.

An August article in Politico found that Democrat losses in voter registration are TEN TIMES those of Republicans.

Here’s the thing. Voters can choose to stay home or they can choose to vote, but the ceiling depends on getting registered. While the Democrats were in great shape as a plurality in 2008, they simply don’t have the stuff to build the same level of turnout this year. The careful reader will note, then, that if Republicans stayed at about the same level and the Democrats dropped, then the Independents will be a huge, possibly critical factor in this election.

This is the second reason why Romney will win. Despite overall poll numbers favoring Obama, Rasmussen, Gallup, and Battleground all show Romney leading in Independent support. The leads are narrow, but clear.

The third reason is the incumbency rule. Back in 1989, Nick Panagakis noted that in 127 out of 155 measurable election races, the proclaimed ‘undecideds’ broke heavily for the challenger. As a result, a close race is considered bad news for the incumbent.
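To put the Panagakis count in perspective, here is a rough Python sketch. The 127-of-155 figure is the one cited above; the poll numbers (48-45 with 7 percent undecided) and the four-to-one split are hypothetical values I picked purely for illustration, not results from any actual poll.

```python
# Share of races in which undecideds broke mostly for the challenger
# (127 of 155, the count cited from Nick Panagakis).
races_for_challenger = 127
races_measured = 155
share = races_for_challenger / races_measured
print(f"Undecideds favored the challenger in {share:.1%} of races")


def allocate_undecided(incumbent, challenger, undecided, challenger_ratio):
    """Split the undecided share between the two candidates.

    challenger_ratio is the challenger's portion of the undecided vote,
    e.g. 0.8 for a four-to-one break (an assumed value).
    """
    to_challenger = undecided * challenger_ratio
    to_incumbent = undecided - to_challenger
    return incumbent + to_incumbent, challenger + to_challenger


# Hypothetical poll: incumbent 48, challenger 45, undecided 7.
final_incumbent, final_challenger = allocate_undecided(48, 45, 7, 0.8)
print(f"Incumbent {final_incumbent:.1f}, challenger {final_challenger:.1f}")
```

Under those assumed numbers, a three-point lead for the incumbent turns into a deficit of about one point once the undecideds are allocated, which is why a sub-50 incumbent in a close race has reason to worry.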

 At RCP, Sean Trende said the problem for an incumbent below 50 percent could be more serious, even if he is leading in the polls. Note also that he wrote this back in 2010, long before he could have known where Obama would be in the polls.

Taken in total, then, Romney has a deeper base of support than the poll weighting reflects, he has small but significant leads in Independent voter support, and Obama’s weak support up to now indicates there is a bloc of undecideds who, historically, will pull the lever for Romney over Obama by something between four-to-one and seven-to-one. Romney wins.

But that doesn’t answer why the polls show Obama ahead, especially if, as I have said, there is no conspiracy. That brings us back to how polls work. Michael Barone used to work as a pollster, and wrote an interesting article on polling last week.

The following is the essence I took from his article. First, back in 1997 about 36 percent of the people contacted agreed to be polled. Pew says that response rate is now down to 9 percent. That means that, if everything else is equal, it’s four times as hard to get your respondent pool. It’s not hard to see that such a drastic change in voter behavior can affect the character of poll results.

Next, polls were originally conducted face to face, and in some places still are (most notably in exit polling). That practice was always the most expensive and difficult method, and some questions have come up about ensuring that proper methods are followed, so it has generally been replaced by telephone polling. Telephone polling has come under some criticism of its own, since so many people today don’t use landline phones, and by law cell phones cannot be machine-dialed. Machine dialing is a major part of the Random-Digit-Dialing (RDD) system which has been the industry norm for a generation. A number of polls have begun to mix online and cell phone polls with landline results, but there is no empirical support to show that mixing such methods does not invalidate the results. Statisticians are pretty picky about what constitutes a valid sample, and tossing respondents in willy-nilly any way you can get them just does not pass the smell test.

This brings us back to the business of polling. While they want to get the numbers right, pollsters also want very much to bring in the Next Big Thing. You don’t see it in the published numbers, but a lot of the pollsters are playing around with ways to get more responses from a broader range of the public. The problem is the inherent demographics in the different contact methods. Cell phone users, for example, tend to be younger and more urban than landline phone owners, which means a heavier proportion of Democrats. Also, phone polls in general tend to draw urban responses more than suburban ones, due to lifestyle differences. As for computer polls, the simple fact that the respondents in such polls approach the pollster rather than wait to be approached is a major deviation from normal behavior, and as such is a red flag as to whether they represent voters at large.

To put it another way, if I am at home and my phone rings from someone I don’t know, I will probably answer anyway, unless I am in the middle of something. If my cell rings and I don’t recognize the number, I am not answering because I AM usually in the middle of something, like work or driving. So even though I have both a landline and a cell phone, I won’t respond at all to a cell phone interview, while I may do so at my home. The norms simply have not been established, and therefore any mixed-methodology poll is not valid as an indicator of state or national opinion.
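The response-rate figures from Barone’s article (36 percent in 1997, 9 percent now) can be turned into a quick back-of-the-envelope calculation. This is only a sketch: the target of 1,000 completed interviews is a hypothetical sample size I chose for illustration, and the function name is my own.

```python
import math


def contacts_needed(target_completes, response_rate_pct):
    """Contacts required to reach a target number of completed interviews."""
    return math.ceil(target_completes * 100 / response_rate_pct)


TARGET = 1000  # hypothetical sample size, not from the article

contacts_1997 = contacts_needed(TARGET, 36)  # 36 percent response rate
contacts_2012 = contacts_needed(TARGET, 9)   # 9 percent response rate

print(f"1997: about {contacts_1997} contacts for {TARGET} interviews")
print(f"2012: about {contacts_2012} contacts for {TARGET} interviews")
print(f"Effort multiplier: {36 / 9:.0f}x")
```

The multiplier is the same whatever sample size you assume, since it depends only on the ratio of the two response rates.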

Another thing to keep in mind is that the polling agencies are not all of equal value. I may disagree with one or another of the big names, but Gallup, Rasmussen, Pew and so on are well known and established. The state polls for this year? Seriously, what do you know about groups like Public Policy Polling (PPP), Gravis Marketing, WeAskAmerica, and so on? The fact is, many of the groups doing state polling this year were not around in 2004 or even 2008, so there’s no way to know how accurate they are. The next thing to notice is that while the big national polling groups are putting polls out every week or so, many of the state polling groups go more than a month between polls, making it difficult to track trends within a specific poll. RCP and other media get lazy and just toss all the results together, figuring the aggregate will be accurate, but that’s statistically invalid.

Don’t just take my word for it; go look up the American Association for Public Opinion Research. If you look into their code of ethics, for example, you will notice that they expect their members to be clear about how they conduct their research, to correct distortions of fact or data, and to clearly show their methodology. Frankly, few public polling groups come close to that standard in actual practice.

Most people would not know the name, but polling took a large step forward with the creation of the exit poll by Warren Mitofsky, who worked with the Census Bureau and later CBS News. By the late 1960s, Mitofsky had established a format to query voters as they left the polls, which has been the standard ever since. It’s significant to note that Mitofsky himself observed a “rather consistent pattern in the presidential contests toward overstating the Democrat’s share of the vote”.

It’s important to observe that Mitofsky did not blame the overstatement of Democrat support on a conspiracy or deliberate attempt to deceive; in his opinion the phenomenon comes from “a difference in response rates” between the candidates’ voters. Mitofsky noted that age and other demographics affect response rates by voters.

In summary, we should start any look at the polls with a reminder that polls are always, to some degree, wrong. They are influenced by assumptions made, not only by the polling groups but also by the respondents themselves. Conspiracy theories should be treated with a generous dose of skepticism, but the reader should pay attention to the track record of past polls, not only at the end right before the election, but earlier in the campaign.

With that in mind, here is a quick look at the salient comparisons (all poll results from Gallup). Since 1950, only two Democrats have been elected President and then stood for re-election: Jimmy Carter in 1980 and Bill Clinton in 1996. But a Democrat has run for President as the candidate of the incumbent party in four additional elections: Adlai Stevenson in 1952, Lyndon Johnson in 1964, Hubert Humphrey in 1968, and Al Gore in 2000. This gives us six elections for examination.

In 1952, Dwight Eisenhower led Adlai Stevenson wire to wire in the polls. The early September poll showed Ike ahead 47-39, October showed a 53-41 lead, and the final poll was 51-49 for Eisenhower, indicating a tight race. The actual results were 55-45 Eisenhower, a wider margin than two of the three polls showed.

 In 1964, LBJ led Goldwater wire to wire in the polls. The early September poll showed LBJ ahead 65-29, October showed a 62-32 lead, and the final poll was 64-36 for Johnson. The actual results were 61-39 Johnson, an easy win but closer than the polls expected.

 In 1968, Nixon and Humphrey went back and forth, with Wallace’s 3rd-party effort confusing things. The September poll showed Nixon ahead 43-31, it was 44-29 Nixon in October, and the final poll tightened up to a 43-42 lead for Nixon. The actual results in 1968 were 44-43 Nixon, so the final poll was right but the earlier ones were off.

 In 1980, President Carter appeared to be surviving the bad economy and Middle East crisis. He held a narrow 39-38 lead over Reagan in September, the two were tied in October, and Reagan took a tight 1-point lead in the final poll. But the actual result, 51-41 Reagan, showed the polls were just plain wrong.

 In 1996, Bill Clinton led Bob Dole wire to wire in the polls. Clinton held a 53-36 lead in September, a 57-32 lead in October, and the final poll had Clinton ahead 52-41. The actual results, again, showed a tighter race than the polls predicted. While Clinton’s 49-41 win was not a nail-biter, the results showed, once again, that the polls continued to favor the Democrat more than was actually the case.

In 2000, the polls seem to have accurately reflected a tight race. George Bush led Al Gore 46-45 in September, the race was tied at 45 in early October, and the final poll had Bush ahead 48-46. The final 48-48 virtual tie (Gore a smidge over 48%, Bush a smidge below 48%) appears to show the polls were right.

So, only one of the elections played out the way it appeared even in the October polls. Just something to keep in mind as you read press releases saying it’s already decided.
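For readers who want to check the arithmetic, here is a small Python sketch that compares each final Gallup poll quoted above with the actual result. Note that this is a narrower comparison than the October one I just made: it looks only at the final poll, and it measures the Democrat-minus-Republican margin in whole points. For 1980 the final poll is given above only as a 1-point Reagan lead, so the margin is entered directly.

```python
# (final-poll Dem-minus-Rep margin, actual Dem-minus-Rep margin),
# in percentage points, from the figures quoted in this post.
elections = {
    1952: (49 - 51, 45 - 55),
    1964: (64 - 36, 61 - 39),
    1968: (42 - 43, 43 - 44),
    1980: (-1, 41 - 51),  # final poll given only as a 1-point Reagan lead
    1996: (52 - 41, 49 - 41),
    2000: (46 - 48, 48 - 48),
}

# Positive error means the final poll overstated the Democrat's margin.
errors = {year: poll - actual for year, (poll, actual) in elections.items()}

for year, err in errors.items():
    print(f"{year}: poll margin minus actual margin = {err:+d}")

mean_error = sum(errors.values()) / len(errors)
overstated = sum(1 for err in errors.values() if err > 0)
print(f"Mean error: {mean_error:+.1f} points")
print(f"Democrat overstated in {overstated} of {len(errors)} elections")
```

A positive number means the final poll was kinder to the Democrat than the voters turned out to be. With only six elections and rounded figures, this is a rough tally and not a statistical test.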

