File this one under "eternal vigilance." Daily Kos has been running weekly polling results for about a year and a half. It recently parted company with its polling firm, Research 2000 (R2K), when FiveThirtyEight found that R2K had a very poor record. If you want to know which polls are reliable, click the previous link.
Perhaps the most famous R2K poll found that large numbers of Republican voters believe outlandish things, such as the birther myth.
Yesterday DKOS announced that there were much graver problems with some of the R2K polls than mere inaccuracy.
A bit over two weeks ago, a group of statistic wizards (Mark Grebner, Michael Weissman, and Jonathan Weissman) approached me with a disturbing premise -- they had been poring over the crosstabs of the weekly Research 2000 polling we had been running, and were concerned that the numbers weren't legit.
I immediately began cooperating with their investigation, which concluded late last week. Daily Kos furnished the researchers with all available and relevant information in our possession, and we made every attempt to obtain R2K's cooperation which… was not forthcoming.
The wizards' report makes for fascinating reading. It is a brilliant example of how an expert, and maybe even a layman, can determine that a statistical summary is bogus without any access to the underlying data. Consider this one sample from R2K:
Approval of | Favorable (Men) | Favorable (Women) | Unfavorable (Men) | Unfavorable (Women) | Undecided (Men) | Undecided (Women) |
Obama | 43 | 59 | 54 | 34 | 3 | 7 |
Pelosi | 22 | 52 | 66 | 38 | 12 | 10 |
Reid | 28 | 36 | 60 | 54 | 12 | 10 |
McConnell | 31 | 17 | 50 | 70 | 19 | 13 |
Boehner | 26 | 16 | 51 | 67 | 33 | 17 |
Cong. (D) | 28 | 44 | 64 | 54 | 8 | 2 |
Cong. (R) | 31 | 13 | 58 | 74 | 11 | 13 |
Party (D) | 31 | 45 | 64 | 46 | 5 | 9 |
Party (R) | 38 | 20 | 57 | 71 | 5 | 9 |
This is a pretty simple polling question: do you have a favorable view of X, an unfavorable view, or are you undecided? However, there is something very wrong with those numbers. Can you see it? Look carefully. I would like to think that I would have seen it if I had looked long enough, but I'll never know because I skipped ahead to the explanation. Give yourself a few minutes, and if it doesn't jump out, then…
Overall, the results are unsurprising. Men are more conservative than women. Adjusted for that, everyone thinks that both parties and both houses aren't worth a pitcher of warm spit. Now let me give you a clue.
Approval of | Favorable (Men) | Favorable (Women) | Unfavorable (Men) | Unfavorable (Women) | Undecided (Men) | Undecided (Women) |
Obama | 43 | 59 | 54* | 34* | 3 | 7 |
Pelosi | 22* | 52* | 66* | 38* | 12* | 10* |
Reid | 28* | 36* | 60* | 54* | 12* | 10* |
McConnell | 31 | 17 | 50* | 70* | 19 | 13 |
Boehner | 26* | 16* | 51 | 67 | 33 | 17 |
Cong. (D) | 28* | 44* | 64* | 54* | 8* | 2* |
Cong. (R) | 31 | 13 | 58* | 74* | 11 | 13 |
Party (D) | 31 | 45 | 64* | 46* | 5 | 9 |
Party (R) | 38* | 20* | 57 | 71 | 5 | 9 |
Now do you see the problem? I've marked all the even numbers with an asterisk. In every male/female comparison, both numbers are even or both numbers are odd. There isn't a single case where the stat for men is even and that for women is odd, or vice versa. That is statistically all but impossible.
Consider a double coin toss. You get HH, TT, HT, or TH, each with equal probability. Do it 27 times with fair coins and you expect about six or seven of each result. If you get nothing but HH and TT, something is very wrong. If you get all HH, you're using double-headed coins.
Of course the numbers above are rounded off, but rounding has a fifty/fifty chance of turning an even number into an odd one or vice versa. No random sampling would have produced the even and odd pairs in the table above. It's not just 27 examples, unfortunately (for R2K).
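If you would rather check the table than take my word for it, here is a minimal Python sketch. It just re-types the 27 men/women pairs from the sample above and counts how many share the same parity. The names ROWS and parity_matches are my own, not anything from the report, and the 50% chance per pair is the same fair-coin assumption used in the text.

```python
# Each row: (fav_men, fav_women, unf_men, unf_women, und_men, und_women),
# copied from the sample table above.
ROWS = {
    "Obama":     (43, 59, 54, 34, 3, 7),
    "Pelosi":    (22, 52, 66, 38, 12, 10),
    "Reid":      (28, 36, 60, 54, 12, 10),
    "McConnell": (31, 17, 50, 70, 19, 13),
    "Boehner":   (26, 16, 51, 67, 33, 17),
    "Cong. (D)": (28, 44, 64, 54, 8, 2),
    "Cong. (R)": (31, 13, 58, 74, 11, 13),
    "Party (D)": (31, 45, 64, 46, 5, 9),
    "Party (R)": (38, 20, 57, 71, 5, 9),
}


def parity_matches(rows):
    """Count men/women pairs that are both even or both odd."""
    matches = 0
    total = 0
    for values in rows.values():
        # Even positions hold the men's figures, odd positions the women's.
        for men, women in zip(values[0::2], values[1::2]):
            total += 1
            if men % 2 == women % 2:
                matches += 1
    return matches, total


matches, total = parity_matches(ROWS)

# Assumption: each pair matches with probability 1/2, independently.
chance_all_match = 0.5 ** total

print(f"{matches} of {total} pairs share parity")
print(f"chance of that under a fair coin: {chance_all_match:.2e}")
```

With 27 pairs, a clean sweep works out to 1 chance in 2^27, or about 1 in 134 million, and that is for this one small table alone.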
Were the results in our little table a fluke? The R2K weekly polls report 778 M-F pairs. For their favorable ratings (Fav), the even-odd property matched 776 times. For unfavorable (Unf) there were 777 matches.
Common sense says that that result is highly unlikely, but it helps to do a more precise calculation. Since the odds of getting a match each time are essentially 50%, the odds of getting 776/778 matches are just like those of getting 776 heads on 778 tosses of a fair coin. Results that extreme happen less than one time in 10^228. That's one followed by 228 zeros. (The number of atoms within our cosmic horizon is something like 1 followed by 80 zeros.) For the Unf, the odds are less than one in 10^231. (Having some Undecideds makes Fav and Unf nearly independent, so these are two separate wildly unlikely events.)
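The arithmetic in the report is easy to reproduce. The sketch below assumes, as the report does, that each pair matches with probability exactly 1/2 and independently of the others, and it reads "results that extreme" as "at least that many matches." The function names are mine, and this is a back-of-the-envelope check rather than the authors' own code.

```python
from fractions import Fraction
from math import comb, log10


def tail_probability(n, k):
    """Exact P(at least k matches in n independent 50/50 trials)."""
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return Fraction(favorable, 2 ** n)


def log10_fraction(p):
    """Base-10 log of a Fraction, safe for values far too small for a float."""
    return log10(p.numerator) - log10(p.denominator)


PAIRS = 778  # M-F pairs reported in the R2K weekly polls

fav = tail_probability(PAIRS, 776)  # favorable: 776 matches
unf = tail_probability(PAIRS, 777)  # unfavorable: 777 matches

print(f"Fav: probability is about 10^{log10_fraction(fav):.1f}")
print(f"Unf: probability is about 10^{log10_fraction(unf):.1f}")
```

Working with exact fractions and taking logarithms only at the end keeps everything in ordinary integer arithmetic. Both exponents come out a little below -228 and -231 respectively, which agrees with the "less than one in 10^228" and "less than one in 10^231" figures quoted above.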
The authors of the report were reasonably cautious in drawing conclusions, but I won't be. Research 2000 was reporting manifestly fraudulent results. This isn't a matter of distorting data, or asking misleading questions. It is a matter of making the data up.
There is no political scandal here. Daily Kos, to its credit, exposed the fact that its own polls were compromised. I suspect that this is probably a simple case of professional incompetence followed by fraud. R2K, or someone in that firm, for whatever reason, couldn't deliver a genuine product on schedule. So they delivered a fake one. If you own R2K stock, I'd sell. Now.
It is a good reminder that you should always be suspicious. It is also really cool to look at something bogus and see it for what it is.
Update! Politico is reporting that R2K polls were among the reasons that Blanche Lincoln was expected to lose the Arkansas primary. If so, and these polls were fraudulent, that would be a major scandal.