Just a few words about the Russian election. I read this entry http://www.badscience.net/2012/03/is-there-statistical-evidence-of-fraud-in-the-russian-election-data/ and decided to look at the data myself. To me, the data do not seem good enough to answer the fraud question.
Download the data, read it in, and take a first look:
> library(gdata)  # read.xls comes from the gdata package
> r1 <- read.xls("xxxxxxxxxxxxxx")
> head(r1)
projecturl id updt region uik obstrusted INVALID VALID
1 http://sms.golos.org 1 38324.72 27 650 1 4 323
2 http://sms.golos.org 2 38689.09 25 216 0 9 927
3 http://sms.golos.org 3 38324.72 38 732 1 7 1282
4 http://sms.golos.org 4 38324.72 25 291 0 14 1185
5 http://sms.golos.org 5 38324.72 38 668 0 15 1510
6 http://sms.golos.org 6 38324.72 27 198 0 15 1889
Zhirinovsky Zyuganov Mironov Prokhorov Putin
1 42 40 3 24 214
2 88 229 58 92 460
3 80 333 46 150 673
4 129 315 67 175 499
5 76 395 70 227 742
6 127 353 115 379 915
At first glance the data look fine. Some columns are unexplained, but region, VALID and the contenders look pretty straightforward.
Some regions occur only once, others quite often, and some are missing completely:
> regs <- xtabs(~ region,data=r1)
> names(regs[regs==1])
[1] "13" "32" "43" "65" "75" "86" "87"
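The regions that are missing completely can be listed in the same spirit. This is only a sketch: it uses just the six rows printed by head(r1) above rather than the full file, and it assumes that region codes are meant to cover every integer between the smallest and the largest code seen. The names r1_head, all_codes and missing_regions are made up for the illustration.

```r
# Region codes of the six rows printed by head(r1) above (not the full file)
r1_head <- data.frame(region = c(27, 25, 38, 25, 38, 27))

# Assumption: every integer code between min and max should be present
all_codes <- min(r1_head$region):max(r1_head$region)

# Codes in that range that never occur in the data
missing_regions <- setdiff(all_codes, r1_head$region)
print(missing_regions)
```

On the full data the same two lines would be run on r1$region instead of r1_head$region.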
The valid vote counts differ quite a lot between regions, as the next plot shows. To someone who does not know this field, that looks very odd.
> plot(xtabs(VALID ~ factor(region,levels=min(region):max(region)),data=r1))
And if we expect VALID = Zhirinovsky + Zyuganov + Mironov + Prokhorov + Putin, that does not hold either.
> r1$myValid <- with(r1, Zhirinovsky + Zyuganov + Mironov + Prokhorov + Putin)
> plot(myValid ~ VALID,data=r1)
The data just do not add up.
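To put a number on the mismatch rather than read it off a scatter plot, one could count the rows where the candidate sum differs from VALID. A minimal sketch, again using only the six rows printed by head(r1) above; r1_head, candidates, gap and n_mismatch are names made up for the illustration.

```r
# The six rows printed by head(r1) above (not the full file)
r1_head <- data.frame(
  VALID       = c(323, 927, 1282, 1185, 1510, 1889),
  Zhirinovsky = c(42, 88, 80, 129, 76, 127),
  Zyuganov    = c(40, 229, 333, 315, 395, 353),
  Mironov     = c(3, 58, 46, 67, 70, 115),
  Prokhorov   = c(24, 92, 150, 175, 227, 379),
  Putin       = c(214, 460, 673, 499, 742, 915)
)

candidates <- c("Zhirinovsky", "Zyuganov", "Mironov", "Prokhorov", "Putin")

# Sum of the candidate columns per row, same quantity as myValid above
r1_head$myValid <- rowSums(r1_head[, candidates])

# Difference between the reported and the recomputed number of valid votes
r1_head$gap <- r1_head$VALID - r1_head$myValid

# Number of rows where the two disagree
n_mismatch <- sum(r1_head$gap != 0)
print(n_mismatch)
```

In these six rows the gap is zero everywhere, so the disagreement visible in the plot must come from rows further down the file. Running the same lines on r1 would show how many rows are affected and by how much.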
Conclusion
Either the data are incomplete and raise too many questions to even think about looking for fraud, or these are the true data, they are as bad as they look here, and the fraud is obvious.