Sunday, September 26, 2004

The average Poll on the front page of www.pollingreport.com today

is a silly concept. Consider the average of 10 polls, using the likely-voter result if available and the result with Nader if available. The average of these 10 polls gives Bush 47.9%, Kerry 43.6%, and Nader plus others roughly 2%, leaving roughly 6.5% undecided. The standard error of the Bush-Kerry difference is slightly more than 1%, that is, 3 or 4% divided by the square root of 10.
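The standard error arithmetic can be sketched in a few lines of Python. This is only a rough illustration: the function name is made up here, and it assumes the 10 polls are independent and that "3 or 4%" is the standard error of the Bush-Kerry difference in a single poll.

```python
import math

def se_of_average(single_poll_se, n_polls):
    """Standard error of the mean of n independent polls (hypothetical helper)."""
    return single_poll_se / math.sqrt(n_polls)

N_POLLS = 10  # number of polls averaged, from the text

# "3 or 4%" is taken as the single-poll standard error of the difference.
for single_poll_se in (3.0, 4.0):
    se = se_of_average(single_poll_se, N_POLLS)
    print(f"single-poll SE {single_poll_se:.0f}% -> SE of the average {se:.2f}%")
```

Both ends of the range land close to 1 percentage point, which is where "slightly more than 1%" comes from.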

Now when people like Charlie Cook warn us to listen to them and not just look at who is ahead in the polls, an important point is that undecided voters tend to split against the incumbent. The rough rule is WYSW(S)HG: what you see is what (s)he gets (did I just invent the world's first absurdly PC acronym?). If this were literally true, then the average poll would predict a Kerry victory, with an expected Kerry vote of 50.1% and an expected Kerry-Bush margin of 2.2%. Given the standard error, this just barely rejects, at the 5% level, the null that the race is currently tied. In other words, assuming WYSW(S)HG, the average poll says the chance that Kerry would win if the election were held tomorrow is about 95%. Now WYSW(S)HG is a very strong assumption. Another way of looking at it is that the race is tied only if Kerry is expected to beat Bush 5 to 1 among the undecideds. To me this means that, even assuming that undecideds break against the incumbent, Bush is slightly ahead at the moment.
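For anyone who wants to check the arithmetic, here is a minimal Python sketch. The poll averages are the ones quoted above; the standard error of 1.1 is an assumed stand-in for "slightly more than 1%", and the function and variable names are invented for the illustration.

```python
BUSH, KERRY, UNDECIDED = 47.9, 43.6, 6.5  # poll averages quoted in the post
SE_DIFF = 1.1  # assumed value for "slightly more than 1%"

def final_margin(kerry_share_of_undecided):
    """Kerry minus Bush after splitting the undecideds (hypothetical helper)."""
    kerry_total = KERRY + UNDECIDED * kerry_share_of_undecided
    bush_total = BUSH + UNDECIDED * (1 - kerry_share_of_undecided)
    return kerry_total - bush_total

# WYSW(S)HG taken literally: every undecided voter goes to Kerry.
margin = final_margin(1.0)
z = margin / SE_DIFF  # compare with 1.96, the two-sided 5% cutoff

# Share of undecideds Kerry needs for an exact tie:
# KERRY + U*s = BUSH + U*(1 - s)  =>  s = (BUSH - KERRY + U) / (2*U)
tie_share = (BUSH - KERRY + UNDECIDED) / (2 * UNDECIDED)
tie_ratio = tie_share / (1 - tie_share)

print(f"Kerry-Bush margin under WYSW(S)HG: {margin:.1f}")
print(f"z statistic: {z:.2f}")
print(f"break-even split among undecideds: {tie_ratio:.1f} to 1")
```

The break-even split comes out a shade under 5 to 1, which is the "5 to 1" figure in round numbers.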

Of course the idea of the average poll is silly anyway.
If polls differed only because of sampling error, the average result of 14 polls would be very reliable. However, there is no reason to believe that polls have an unbiased sample, since different people have different probabilities of refusing to respond, of not having a landline phone, and of never being home. There is no reason to believe that the average likely-voter filter is perfect. It is possible that Gallup is right and everyone else is wrong, so that Kerry supporters are markedly less likely to vote than Bush supporters. The average also includes polls of registered voters, so the implicit assumption is that the average likely-voter filter overdoes it and that averaging in a few registered-voter polls gets it right. Finally, the 2% for Nader is partly an average and partly a guess that the truth is somewhere in between the results of polls where people are prompted for Nader and polls where they volunteer "other", not that it matters much at the moment.
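The difference between sampling error and a bias shared by all polls can be illustrated with a small simulation. Everything numeric here is hypothetical: the 2-point shared bias and the 3.5-point single-poll standard error are made-up inputs, not estimates, and the function name is invented for the sketch.

```python
import random
import statistics

def simulate_average_error(n_polls, sampling_se, shared_bias,
                           n_trials=5000, seed=0):
    """Return (mean, spread) of the error of an n-poll average.

    The true margin is taken as 0, so each simulated poll result is its error:
    a bias common to every poll plus independent sampling noise.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_trials):
        polls = [shared_bias + rng.gauss(0, sampling_se) for _ in range(n_polls)]
        errors.append(sum(polls) / n_polls)
    return sum(errors) / n_trials, statistics.pstdev(errors)

# Hypothetical inputs: 14 polls, 3.5-point sampling SE, 2-point shared bias.
mean_error, spread = simulate_average_error(14, 3.5, 2.0)
print(f"mean error of the average: {mean_error:.2f}")
print(f"spread of the average:     {spread:.2f}")
```

Averaging shrinks the spread to roughly the single-poll standard error divided by the square root of 14, but the mean error stays at whatever bias the polls share.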

In all, a silly exercise, and one I plan to perform compulsively until election day.