I became interested in polls back when the first George Bush
ran for President. I had noticed how
Reagan thumped Carter in 1980 after the polls said he was behind, and noticed
how odd the polls looked at various times in subsequent elections. So I started paying attention to polling
agencies and tried to understand how they work.
There’s a certain mystic quality to polls, you know. While they carefully state that polls are not
predictive, polling groups well understand the influence they carry in shaping
opinion as well as reporting it. Human
behavior is especially sensitive to Heisenberg’s warning that the act of
observation itself changes the behavior of the observed. The idea that you can ask a few people about
something and get a detailed report on what everyone is thinking seems a bit
far-fetched. Which brings us to
Statistics and the importance of getting details right.
Statistics is all about numbers, but it’s also about
summarizing large amounts of information. It’s kind of like the handle
on a suitcase: it gives you a way to get a grip on something large. Opinion polling is based on the very real
science of extrapolating large group behavior from known demographics and, well,
a certain herd mentality. Statistics is
the reason advertising works, for example.
A company finds out that people pay attention to certain sensory
triggers, and they tie their product in to those things. Credit ratings work the same way, on the
empirical record that people behave in ways which are generally
predictable.
Polls enjoy a certain protection from suspicion, as
well. They publish only summaries of
their results, on the claim that their details are proprietary. This is true, except that it also means no
one checks the process or integrity of most polls’ methodology. As a result, a poll could theoretically publish
whatever results it wished, working the numbers backward from the desired conclusion. In the case of election opinion polling, the
only way to know how well a poll performed is to compare its last release
before an election to the actual result, which says nothing about polls
released earlier in the campaign. And
boy howdy, sometimes polls earlier in a campaign are very odd.
Having said this, I pretty much reject it. Polls are in business to attract clients, and
you don’t succeed by publishing results that could blow up on you if your
employees revealed sloppy methods. But
it’s also true that sometimes polls change methods and get worse instead of
better. The problem comes down to your
business model. There are well-known
polls like Gallup and Rasmussen and Harris, which have been in the business for
literally decades; there are some universities and colleges which put out polls
somewhat irregularly; and there are a great many small, frankly unknown polling
agencies whose purpose in publishing election opinion polls seems to be
attracting future clients.
There are also the private polling firms the public never hears about,
companies who perform polls for the specific and confidential use of their
clients. Both Romney and Obama use
private polling firms, which not only focus on selected demographics and
battleground states, but also track effectiveness of strategies and
tactics. The candidates do not disclose
their internal poll results (although you may hear rumors from time to time
from aides), but you can get a sense of their poll results by how confident and
ambitious the candidate appears and where they go to make appearances. This late in the race, candidates will only
make appearances in states where they believe their personal appearance will be
most effective.
Part of the problem of polling, aside from the small
start-ups with no history which pop up every election and pretend to be just as
reliable as the established polls, is laziness.
Very few of us are willing to print out and study all of the polls and
their internal data to weigh the election race in context. Most people just want to know who’s winning,
so they want a nice easy-to-read scoreboard.
And that has led us to aggregation.
What’s worse, people who really should know better also buy in to
aggregated summaries and sell them not only as accurate displays of the
election condition, but also in some cases as valid predictors of future
behavior. Frankly, this kind of behavior
is dishonest.
Statistics is a science, but since it studies human behavior
certain cautions are necessary.
Aggregation bias is a well-known problem in statistical analysis, as you
can see in studies by:
The National Institutes of Health, regarding aggregation of
group level statistics
The Federal Reserve Bank, regarding aggregation errors in
economic behavior
The American Statistical Association, regarding covariance
analysis and aggregation distortion
The problem comes from forgetting that polls use very small
portions of the voter pool to ascertain opinion and trends. Aggregation amplifies whatever bias exists in
any given poll, and exacerbates error if a common assumption proves
faulty. This is why aggregation is
considered an invalid practice in statistical analysis. As much as I appreciate Real Clear Politics
for presenting a one-stop place to compare and consider polling, their RCP
Average is invalid as a true indicator of the election condition.
Aggregation is easy to read, though, and a lot of folks like
the idea that they understand what’s going on by just reading one website’s
release. Some of these bloggers have
enjoyed undeserved praise for piggy-backing off other people’s work, like Nate
Silver. Mister Silver developed a
formula which basically takes published poll results, weights them according to
his personal preference, aggregates the results and produces an average. He then assumes the formula is good for a
prediction of the election, which he publishes like a weather report. Silver recently announced that despite
setbacks since the start of October, President Obama still has, in his opinion,
a seventy percent chance of winning. Stop
and consider that for a moment, in its emotional impact. If the guy doing the weather on TV said there
was a seventy percent chance of rain tomorrow, you’d expect rain, naturally,
and possibly a heavy rain. At the very
least, you’d expect to see heavy clouds and some wind. No one would expect bright sunshine and blue
skies. So Mister Silver’s statement is a
clear warning that things are going poorly for the Romney campaign, dire and
bleak. Never mind that most of the
well-established national polls (Gallup, Rasmussen, ABC News/Washington Post)
not only give Romney a good chance, they actually show him winning right
now. The problem, you see, is that
Mister Silver is an Obama partisan, and he allows that to flavor his
reports.
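To make the mechanics concrete, here is a minimal Python sketch of the general idea described above: take published poll margins, apply weights, and average. This is not Silver's actual formula, which is not given here. The function name, the poll margins, and the weights are all made up for illustration.

```python
def weighted_poll_average(polls):
    """Return the weighted mean of (margin, weight) pairs.

    margin: a candidate's lead in points (negative means trailing).
    weight: how much the aggregator chooses to trust that poll.
    """
    total_weight = sum(weight for _, weight in polls)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(margin * weight for margin, weight in polls) / total_weight


# Hypothetical polls: (margin in points, weight). Not real data.
example_polls = [
    (2.0, 1.0),
    (-1.0, 0.5),
    (4.0, 2.0),
]

# Same polls, but every poll trusted equally.
equal_weight_polls = [(margin, 1.0) for margin, _ in example_polls]

weighted = weighted_poll_average(example_polls)
unweighted = weighted_poll_average(equal_weight_polls)

print(f"weighted average:   {weighted:.2f}")
print(f"unweighted average: {unweighted:.2f}")
```

With these invented numbers the weighted figure lands near 2.7 points while the plain average is near 1.7, so the choice of weights alone moves the result by about a point.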
In a way, I think I should like Nate Silver more than I
do. I’m something of a numbers guy in my
work, I love polls and politics, and I enjoy reading up on opinion polling in
particular and can appreciate the work that goes into an analysis, especially
an on-going one through a campaign lasting more than a year. But I learned in 2008 the hard way about
assumptions; after a bit of success in analyzing polls in 2004 for the
Bush-Kerry race as a writer for Polipundit, I made the mistake of trusting a
similar model in 2008. I realized after
the election that conditions are different for every election, and if you do
not adjust for the new conditions you will make mistakes, possibly serious
errors. Silver is doing the same thing
now, I think: his formula turned out great in 2008 and he thinks, why change
what’s working? The problem is, the
election this year is fundamentally different from the 2008 election, a point to which
I will return in a later post.
The reader might counter that most polls turn out to be generally
accurate (and that aggregators are therefore reliable), but that statement does
not hold up to close inspection. First,
consider that the only polls measured for accuracy are the final polls, which
generally come out in the last week of the campaign. Predicting the outcome that late is like
predicting the score of a football game in the last minute of the fourth
quarter; it’s not really something to brag about.
And polls don’t do all that well, actually, in terms of
precision. If you take the time to go
back and look at the polls’ historical error from actual results, the following
average comes out:
Gallup (last 10): 3.41 points average error
Harris (last 10): 3.63 points average error
ABC/WaPo (9 elections): 3.05 points average error
CBS/NYT (9 elections): 6.37 points average error
USA Today (7 elections): 3.86 points average error
NBC/WSJ (6 elections): 5.73 points average error
Zogby (4 elections): 2.70 points average error
Pew (4 elections): 2.40 points average error
Battleground/GWU (4 elections): 5.75 points average error
Rasmussen (3 elections): 3.40 points average error
CNN (3 elections): 2.15 points average error
Newsweek (3 elections): 5.63 points average error
Fox (3 elections): 7.10 points average error
Marist (3 elections): 4.30 points average error
IBD (3 elections): 4.43 points average error
Ipsos (2 elections): 2.70 points average error
ARG (2 elections): 2.20 points average error
TIME (2 elections): 3.80 points average error
LA Times (2 elections): 5.00 points average error
(Figures are the deviation of each poll’s final pre-election release from each candidate’s actual Presidential
election result, over the last ten such elections where available. Data was collected from the
online site for each poll and corroborated where possible with RCP and
PollingReport.com for the election year.)
Given that most polls advertise a margin of error of about
3.5%, this data shows that of twenty national polls which published final releases
before elections, eleven have average actual errors beyond their published
margin of error, and two-thirds of the polls which have published final releases in six
or more of the last ten Presidential elections had average errors greater
than their published margin of error.
While some polls seem to do much better in accuracy, keep in mind these
are the polls with less experience in election polling, and so their perceived
accuracy is possibly just a fluke. And,
as I said, these results are how the final poll performs; some of the earlier
releases are more than a little bit suspect in their claims.
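For anyone who wants to check the tally, the counts can be reproduced with a short Python sketch. The figures are copied from the list above. The 3.5-point threshold is the "about 3.5%" advertised margin mentioned in the text, applied uniformly to every poll as an assumption, since each poll's own published margin is not listed here.

```python
# (elections covered, average error in points), copied from the list above.
average_errors = {
    "Gallup": (10, 3.41),
    "Harris": (10, 3.63),
    "ABC/WaPo": (9, 3.05),
    "CBS/NYT": (9, 6.37),
    "USA Today": (7, 3.86),
    "NBC/WSJ": (6, 5.73),
    "Zogby": (4, 2.70),
    "Pew": (4, 2.40),
    "Battleground/GWU": (4, 5.75),
    "Rasmussen": (3, 3.40),
    "CNN": (3, 2.15),
    "Newsweek": (3, 5.63),
    "Fox": (3, 7.10),
    "Marist": (3, 4.30),
    "IBD": (3, 4.43),
    "Ipsos": (2, 2.70),
    "ARG": (2, 2.20),
    "TIME": (2, 3.80),
    "LA Times": (2, 5.00),
}

# Assumption: one uniform advertised margin of error for every poll.
ADVERTISED_MARGIN = 3.5

# Polls whose average error exceeds the advertised margin.
over_margin = [
    name for name, (_, error) in average_errors.items()
    if error > ADVERTISED_MARGIN
]

# Polls with final releases in six or more elections.
veterans = [
    name for name, (elections, _) in average_errors.items()
    if elections >= 6
]
veterans_over_margin = [name for name in veterans if name in over_margin]

print(f"over the margin: {len(over_margin)} polls")
print(f"six or more elections: {len(veterans_over_margin)} of {len(veterans)} over the margin")
```

Under that uniform-margin assumption, the sketch finds eleven polls over the margin, and four of the six polls with six or more elections on record, which is the two-thirds figure quoted above.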
The technical term used for the discovery of this fact is
“oops”. But you don’t see the polls
bring that up in their releases, even when the election is close enough that
these actual error margins eclipse the presented results from the poll.