Sunday, February 03, 2008

Primaries, Polls, and Position = Potential Problems?

The polls this year are a bit strange. Real Clear Politics shows McCain leading Romney in ten out of eleven states going into Super Tuesday, when twenty-one states will hold a caucus or primary to choose the Republican nominee for President. This would appear to be a commanding lead for McCain, and many pundits have gone so far as to claim the race is already over. What’s odd is that discussions with Republicans show a strong distaste in the party for McCain as the nominee, and many people have insisted they would never vote for him. So how is it that the polls say something so different from the sentiment among real people? Is there a ‘silent majority’ for McCain? Is there an effort by the polls to get McCain nominated? Is someone lying? On the available evidence and the history of past elections, the answer to all three questions would appear to be ‘no’.

To examine this further, let’s look at California’s primary, where 173 delegates are at stake. That’s more than twice what any of the candidates has right now, and it’s the richest prize in Super Tuesday’s fat wallet of nomination support. RCP’s average shows McCain 5.0 percentage points ahead of Romney.

But when we look a bit closer, a different picture begins to appear. The RCP average is built from the major polls taken between January 29 and February 2; in those polls McCain’s support runs from 32% (Rasmussen) to 40% (Mason-Dixon), while Romney’s runs from 24% (Field) to 37% (Reuters/CSpan/Zogby). That creates a statistical overlap of 66%: counting each whole percentage from 32 through 40, McCain has a 9-point range of support, but only the top 3 points are higher than any poll number reached by Romney in the same period. This is important, because poll numbers are not static but fluid; they change all the time, and a report from a polling agency is a snapshot of one look from one perspective – this is one reason why different polls will report different results, even when using the same methodology. It should also be observed that while McCain has the lead in California in 4 of the 5 polls used by RCP for their current average, the poll with the highest number of respondents and therefore the lowest statistical margin of error (the Reuters/CSpan/Zogby poll, with 1185 Likely Voters; the next highest is Rasmussen with only 652 Likely Voters) shows Romney leading McCain in California 37% to 34%. That poll is also the most recent.
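The arithmetic in that paragraph can be checked with a short sketch. It is only an illustration under textbook assumptions that the polls themselves do not state: a simple random sample, a 95% confidence level (z = 1.96), and the worst-case 50% share for the single-candidate margin. The function names are made up for this example, and real pollsters weight their samples, so their published margins may differ.

```python
import math

def overlap_share(a_low, a_high, b_low, b_high):
    """Share of A's inclusive whole-point range that B's range also reaches."""
    a_points = a_high - a_low + 1
    shared = max(0, min(a_high, b_high) - max(a_low, b_low) + 1)
    return shared / a_points

def margin_of_error(n, p=0.5, z=1.96):
    """Sampling margin for one candidate's share (assumes simple random sampling)."""
    return z * math.sqrt(p * (1 - p) / n)

def lead_margin_of_error(p1, p2, n, z=1.96):
    """Sampling margin for the gap p1 - p2 when both shares come from the same poll."""
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)

# McCain 32-40, Romney 24-37, as quoted from the RCP polls above.
mccain_overlap = overlap_share(32, 40, 24, 37)

# Sample sizes quoted above: Reuters/CSpan/Zogby 1185, Rasmussen 652.
moe_zogby = margin_of_error(1185)
moe_rasmussen = margin_of_error(652)

# Romney 37%, McCain 34% in the Reuters/CSpan/Zogby poll.
moe_lead = lead_margin_of_error(0.37, 0.34, 1185)

print(f"Overlap of McCain's range with Romney's: {mccain_overlap:.1%}")
print(f"Margin of error, n=1185: {moe_zogby:.1%}")
print(f"Margin of error, n=652:  {moe_rasmussen:.1%}")
print(f"Margin of error on the 3-point lead: {moe_lead:.1%}")
```

Under these assumptions the larger sample does give the smaller margin, but the sampling margin on the 3-point gap between the two candidates comes out a little under 5 points, which is larger than the gap itself.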

I am not going to jump to conclusions from the Reuters/CSpan/Zogby poll, however, for a number of reasons. First, while I am confident that the major polls are doing their best to produce an authentic reflection of voter sentiment, it is never a good idea to depend on just one poll for your conclusions. Second, the R/C/Z poll notes that 13% of the voters are undecided, which is far greater than the difference between the two top candidates. Third, I have found in years past that no matter what the stated margin of error is, an analyst should not regard any poll as a significant indicator when the margin between two candidates is 3 points or less. That’s not only because the margin of error is often that large or greater, but also because fluctuations in support can – even in a short time – change the picture substantially. And in the case of the California primary, there is another detail which drastically damages the validity of any poll reported up to now: the congressional district allocation of delegates.

Here’s the thing: the primary will award 173 delegates, which makes California arguably the most important state in the primary campaign. But the candidate who gets the most votes in California does not get all of those 173 delegates, at least not just because he has the most votes in the state. In fact, it is completely possible that a candidate could “win” California in the popular vote, but claim fewer delegates than an opponent. This is because the delegates are awarded in California on a congressional district basis. Each district will tally up its results and award delegates solely on the basis of that district’s election results. That makes California not one state primary but, in real practice, 53 separate mini-primaries, which collectively will award the delegates. Anyone familiar with California knows that there are liberal areas of the state, and there are conservative areas of the state, and there are nutjob areas of the state; it would be hopelessly naïve for anyone to expect one candidate to win in every one of the congressional districts. The results of the primary are far too complex for opinion polling to give a clear and confident answer about who will claim the most delegates there.
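A toy example shows how a popular-vote winner can come away with fewer delegates. Everything in it is hypothetical: five made-up districts with invented vote totals, two candidates called A and B, and an assumed rule that each district gives 3 delegates, winner-take-all, to whoever leads there. The real allocation rules and numbers are not taken from this post, and any statewide delegates are ignored.

```python
from collections import Counter

# Assumed for illustration only: 3 delegates per district, winner-take-all.
DELEGATES_PER_DISTRICT = 3

# Invented vote totals for five hypothetical districts.
district_votes = [
    {"A": 70000, "B": 30000},  # A wins big
    {"A": 65000, "B": 35000},  # A wins big
    {"A": 48000, "B": 52000},  # B wins narrowly
    {"A": 49000, "B": 51000},  # B wins narrowly
    {"A": 47000, "B": 53000},  # B wins narrowly
]

popular_vote = Counter()
delegates = Counter()

for votes in district_votes:
    # Add this district's votes to the statewide totals.
    popular_vote.update(votes)
    # The district's leader takes all of that district's delegates.
    winner = max(votes, key=votes.get)
    delegates[winner] += DELEGATES_PER_DISTRICT

print("Popular vote:", dict(popular_vote))
print("Delegates:   ", dict(delegates))
```

In this made-up case candidate A piles up votes in two districts and leads the overall count, while candidate B edges out three districts and takes more delegates. A statewide poll measures only the first of those two numbers.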

Why don’t the media, or even the polls, tell you this? Because they are in the media business. Polling groups exist not just to collect and analyze data for their clients (who, by the way, are not the general public – you should always be wary of any information you get from a for-profit group for free), but to build their company name and sell attention. A poll which never says anything unexpected or controversial will never gain prominence. Also, the media in general make their living by making clear statements. They would much rather sell a good story and admit later that they were wrong (as with New Hampshire’s primary this year) than admit how much of what they predict is pure guesswork and assumption. The idea is to keep you interested. This is why stories about Britney outnumber stories about Darfur, for example.

I’d like to finish with a caveat about the polls. The best poll I have ever seen, based on its methodology, its transparency of procedure, and its explanation of results, is the Gallup poll. However, even the Gallup poll has limits to how good its data can be. In discussing its recent primary polling, Gallup included the following methodology note with its daily tracking polls:

“Methodology: Gallup is interviewing 1,000 U.S. adults nationwide each day during 2008. The results reported here are based on combined data from Jan. 30-Feb. 1, 2008, including interviews with 1,080 Republican and Republican-leaning voters, and 1,205 Democratic and Democratic-leaning voters.”

The significance of that note comes from close examination of its statements. The respondents are not described as registered voters, or as people who voted in the last election. In fact, to reach enough people for their poll results, the Republican and Democratic results are, well, diluted a bit to include “Republican-leaning” and “Democratic-leaning” voters, which when you think about it can be said to mean independents, and people who do not normally vote because they do not identify with either party’s values enough to call themselves Republican or Democrat directly. Thus, even the Gallup poll cannot be said to cleanly reflect the sentiments and opinions of mainstream Republicans and Democrats. Given the way the primaries are set up, these “Republican-leaning” and “Democratic-leaning” voters may be able to vote in their state’s caucus or primary, but history shows that people who do not strongly identify with a party or candidate are generally unlikely to take the time and trouble to vote. This is a final and significant point to consider when applying opinion poll reports to a projection of a primary election.

