More Evidence of Fraud in Iran: The Devil is in the Digits
From my new colleagues Bernd Beber and Alex Scacco:
In the past week, analysts have scoured the results from Iran’s presidential election, looking for evidence of fraud. In an op-ed in today’s Washington Post online, we offer a different take on the problem. The key idea in the piece is that people are poor randomizers: When humans try to fake numbers, they leave traces of their activity in the data. For instance, psychologists have found that people choose some digits more often than we would expect in a sequence of random numbers.
What distinguishes our approach from Walter Mebane’s note (discussed in a previous Monkey Cage post) is that we look at patterns of last digits in province-level vote returns. In a fair election, each digit (0, 1, 2, etc.) should appear about as often as any other in the last position. But that’s not the case in the numbers from Iran: Among other things, we find too many 7s and too few 5s. The deviations we find suggest there’s little chance the Iranian election results weren’t manipulated.
You can find the full op-ed here, as well as supplementary materials (including the annotated version of the op-ed and the data and code used in the analysis) here.
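The last-digit idea described above can be illustrated with a short sketch. This is not the authors' code (their data and code are in the supplementary materials); it is a minimal Python example that assumes a chi-square statistic against a uniform distribution and a simulated p-value, both of which are choices made here for illustration. The function names and the sample vote counts are hypothetical.

```python
import random
from collections import Counter


def last_digit_counts(values):
    """Count how often each digit 0-9 appears as the last digit."""
    counts = Counter(abs(v) % 10 for v in values)
    return [counts.get(d, 0) for d in range(10)]


def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts for all ten digits."""
    n = sum(counts)
    expected = n / 10
    return sum((c - expected) ** 2 / expected for c in counts)


def simulate_p_value(counts, trials=10000, seed=0):
    """Share of simulated uniform samples at least as uneven as the observed one."""
    rng = random.Random(seed)
    n = sum(counts)
    observed = chi_square_uniform(counts)
    hits = 0
    for _ in range(trials):
        sim = [0] * 10
        for _ in range(n):
            sim[rng.randrange(10)] += 1
        if chi_square_uniform(sim) >= observed:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical vote counts, invented for illustration only.
    votes = [104217, 88357, 23407, 5127, 67377, 91043, 12967, 45217]
    counts = last_digit_counts(votes)
    print("last-digit counts:", counts)
    print("simulated p-value:", simulate_p_value(counts, trials=2000))
```

A small simulated p-value means that last digits this uneven would rarely arise if every digit were equally likely. The test is only meaningful if the statistic is chosen before looking at the data.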
Comments
Wasn’t a similar claim once made about official (US military) casualty figures in the Vietnam war, i.e. a false veneer of precision was given by not ending figures with 5 or 0?
Posted by: James Conran | June 20, 2009 07:08 PM
I’m curious whether you actually agree with this analysis or are just passing it on. Looking at the penultimate digit of Obama/McCain 2008 results in their data set, I see the following:
7s: 20%
8s: 5%
The odds of this (1.5% in my simulation) are lower than the 17%/4% Iranian observation (about 3.5%). Furthermore, the standard deviation of frequencies is 0.0415, even higher than the last digit in Iran (the standard error is also higher).
Applied to the penultimate digit in the 2008 US election, the same analysis is even more indicative of fraud.
Did the authors actually establish a “one number at least 17% and one number at most 4% in the last digit” test before looking at the data? Nothing in the annotated version of the article suggests as much. And their previous work on Nigerian election data found a totally different phenomenon. If the underlying premise here is that humans will make specific, detectable flaws when asked to generate random numbers, why does the pattern change? Why do the authors discuss the behavioral psychology behind nonadjacent number frequencies here, and not the evidence, discussed in the Nigerian analysis, that humans prefer lower numbers?
It’s trivial to come up with dozens of equivalent rare events that could be observed in last digit frequencies. It’s completely unremarkable that these authors observed one of them, and it’s arrogant to state that this leaves little room for reasonable doubt.
Posted by: Zach | June 23, 2009 04:13 PM
This article provides a pretty good conceptual analysis of that study if you’re interested.
Posted by: James | June 27, 2009 02:08 PM