|
||||
|
Democratic
societies are built around the principle of free and fair elections, in which
each citizen’s vote should count equally. National elections can be regarded as
large-scale social experiments, where people are grouped into usually large
numbers of electoral districts and vote according to their preferences. The
large number of samples implies certain statistical consequences for the
polling results which can be used to identify election irregularities. Using
a suitable data collapse, we find that vote distributions of
elections with alleged fraud show a kurtosis a hundred times larger than that of normal
elections. As an example we show that reported irregularities in the 2011 Duma election are indeed well explained by systematic
ballot stuffing and develop a parametric model quantifying the extent to which
fraudulent mechanisms are present. We show that if specific statistical
properties are present in an election, the results do not represent the will
of the people. We formulate a parametric test detecting these statistical
properties in election results. For demonstration the model is also applied to election outcomes of several other
countries.

Free and
fair elections are the cornerstone of every democratic society [1]. A central
characteristic of elections being free and fair is that each citizen’s vote
counts equally. However, Joseph Stalin already believed that “The people who
cast the votes decide nothing. The people who count them decide everything.”
How can it be distinguished whether an election outcome represents the will
of the people or the will of the counters? Elections
are fascinating, large scale social experiments. A country is segmented into
a usually large number of electoral districts. Each district represents a
standardized experiment where each citizen articulates his/her political
preference via a ballot. Despite differences in e.g. income levels,
religions, ethnicities, etc. across the populations in these districts,
outcomes of these experiments have been shown to follow certain universal
statistical laws [2, 3]. Huge deviations from these expected distributions
have been reported for the votes for United Russia, the winning party in the
2011 Duma election [4, 5]. In
general, using an appropriate re-scaling of election data, the distributions
of votes and turnout are approximately Gaussian [3]. Let $W_i$ be the number of votes for the winning party and $N_i$
the number of voters in electoral district $i$; then
the logarithmic vote rate is
$\nu_i = \log\frac{N_i - W_i}{W_i}$. In figure 2 we show the distribution of $\nu_i$ over all electoral districts. To first order the data from different
countries collapse to a Gaussian. Clearly the data for Russia and Uganda
boldly fall out of line. Skewness and kurtosis are
listed for each data-set in table SII, confirming these observations
quantitatively. Most
strikingly, the kurtosis of the distributions for Russia (2003, 2007 and
2011) and Uganda deviates by two orders of magnitude from that of every other country.
The only reasonable conclusion from this is that the voting results in Russia
and Uganda are driven by mechanisms or processes different from those in the other countries. However,
such distributions only reveal part of the story, and a different
representation of the data becomes helpful to gain a deeper understanding.
Figure 1 shows a 2-d histogram of the number of electoral districts for a given
fraction of voter turnout (x-axis) and for the percentage of votes for the
winning party (y-axis). Results are shown for recent parliamentary elections
in Austria, Finland, Russia, Spain, Switzerland, and the UK, and presidential
elections in the USA and Uganda. Data were obtained from the official election
homepages of the respective countries; for more details and more election
results, see SOM.
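The two representations discussed above, the logarithmic vote rate and the turnout-versus-votes fingerprint, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, districts where the log is undefined are simply dropped, the kurtosis is the plain (non-excess) fourth standardized moment so that a Gaussian gives 3, and the number of bins is arbitrary.

```python
import numpy as np

def log_vote_rate(votes_winner, electorate):
    """nu_i = log((N_i - W_i) / W_i) for each district.

    Assumption: districts with W_i = 0 or W_i = N_i are dropped,
    because the logarithm is undefined there.
    """
    w = np.asarray(votes_winner, dtype=float)
    n = np.asarray(electorate, dtype=float)
    keep = (w > 0) & (w < n)
    return np.log((n[keep] - w[keep]) / w[keep])

def skewness_and_kurtosis(x):
    """Third and fourth standardized moments (kurtosis of a Gaussian is 3)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3)), float(np.mean(z**4))

def fingerprint(valid_votes, votes_winner, electorate, bins=100):
    """2-d histogram of districts over (turnout, vote share of the winner)."""
    v = np.asarray(valid_votes, dtype=float)
    w = np.asarray(votes_winner, dtype=float)
    n = np.asarray(electorate, dtype=float)
    keep = (n > 0) & (v > 0)                      # skip empty districts
    turnout = np.minimum(v[keep] / n[keep], 1.0)  # cap turnout at 100%
    share = w[keep] / v[keep]
    return np.histogram2d(turnout, share, bins=bins, range=[[0, 1], [0, 1]])
```

Here `hist, turnout_edges, share_edges = fingerprint(V, W, N)` returns the district counts per cell, which can be passed to any image plotting routine to obtain a fingerprint-style picture.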
These figures can be interpreted as fingerprints of several processes and
mechanisms leading to the overall election results. For Russia and Uganda the
shape of these fingerprints is immediately seen to differ from that of the other
countries. In particular, there is a large number of districts (thousands)
with 100% turnout and, at the same time, 100% of votes for the
winning party. The shape
of these irregularities can be understood with the assumption of the presence
of the fraudulent action of ballot stuffing. This means that bundles of ballots
with votes for one party are stuffed into the urns. Videos purportedly
documenting these practices are openly available on online platforms [6–8].
In one case the urn is already filled with ballots before the elections
start, e.g. [6], in other cases members of the election commission are caught
filling out ballots, e.g. [7]. In yet
another case the pens in the polling stations are shown to be erasable, e.g.
[8]. Are these incidents non-representative exceptions or the rule? We
develop a parametric model to quantify the extent of ballot stuffing for a given
party to explain the election fingerprints in figure 1. The distributions for
Russia and Uganda suggest two modes of ballot stuffing: incremental fraud and
extreme fraud. Incremental fraud means that, with a given rate, ballots for one
party are added to the urn and votes for other parties are replaced. This
occurs within a fraction fi of electoral districts.
In the election fingerprints in figure 1 these districts are shifted to the
upper right. Extreme fraud corresponds to reporting nearly all votes for a
single party with an almost complete voter turnout. This happens in a
fraction fe of districts, which form a second
cluster near 100% turnout and votes for the incumbent party. For
simplicity in the model we assume that within each electoral district turnout
and voter preferences follow a Gaussian distribution with the mean and
standard deviation taken from the actual sample, see figure S2. With probability
fi (fe) the incremental
(extreme) fraud mechanisms are then applied. Note that if more detailed assumptions
are made about possible mechanisms leading to large-scale heterogeneities in
the data such as city/country differences in turnout (UK) or coast–non-coast (USA)
(see SOM), this will have an effect on the estimate of fi.
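The model described above can be illustrated with a small simulation. This is a sketch under loudly stated assumptions rather than the authors' implementation: the function name, the half-normal draw for the fraud intensity `x`, and the rule that the same fraction `x` of non-voters and of opposition votes is converted into winning-party ballots are all hypothetical choices made for illustration. Only the Gaussian baseline, the probabilities fi and fe, and the extreme fraud width of 0.075 (from the SOM) come from the text.

```python
import numpy as np

def simulate_districts(n_districts, mean_turnout, sd_turnout,
                       mean_share, sd_share, f_incremental, f_extreme,
                       incremental_width, extreme_width=0.075, seed=0):
    """Simulate (turnout, winner share) per district with two fraud modes."""
    rng = np.random.default_rng(seed)

    # Fair baseline: Gaussian turnout and vote share, clipped to [0, 1].
    turnout = np.clip(rng.normal(mean_turnout, sd_turnout, n_districts), 0, 1)
    share = np.clip(rng.normal(mean_share, sd_share, n_districts), 0, 1)

    # Work in fractions of the electorate.
    winner = turnout * share
    others = turnout * (1.0 - share)
    absent = 1.0 - turnout

    # Assign each district to extreme fraud, incremental fraud, or neither.
    u = rng.random(n_districts)
    extreme = u < f_extreme
    incremental = (~extreme) & (u < f_extreme + f_incremental)

    # Assumed fraud intensity x in [0, 1]: small for incremental fraud,
    # close to 1 for extreme fraud (half-normal draws, hypothetical choice).
    x = np.zeros(n_districts)
    x[incremental] = np.abs(rng.normal(0.0, incremental_width, incremental.sum()))
    x[extreme] = 1.0 - np.abs(rng.normal(0.0, extreme_width, extreme.sum()))
    x = np.clip(x, 0.0, 1.0)

    # Assumed mechanism: a fraction x of non-voters is added as ballots for
    # the winner, and a fraction x of opposition votes is replaced.
    switched = x * others
    winner = winner + x * absent + switched
    others = others - switched

    new_turnout = winner + others
    new_share = np.divide(winner, new_turnout,
                          out=np.zeros_like(winner), where=new_turnout > 0)
    return new_turnout, new_share, incremental, extreme
```

Calling the function with `f_incremental = f_extreme = 0` gives the fair-election baseline; increasing either parameter shifts districts toward the upper right of the turnout-versus-votes plane, which is the qualitative effect the model is meant to capture.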
Figure 3 compares the observed and modelled fingerprint plots for the winning
parties in Russia, Uganda and Switzerland. Model results are shown for fi = fe = 0 (fair elections)
and for best fits to the data (see SOM) for fi and fe. To describe the smearing from the main peak to the
upper right corner, an incremental fraud probability around fi = 0.64 is needed for the case of United Russia. This
means fraud in about 64% of the districts. In the second peak around the 100%
turnout scenario there are roughly 3,000 districts with 100% of votes for
United Russia representing an electorate of more than two million people.
Best fits yield fe
= 0.05, i.e. five percent of all electoral districts experience extreme
fraud. A more detailed comparison of the model performance for the Russian
parliamentary elections of 2003, 2007 and 2011 is found in figure S3. The
fraud parameters for the Uganda data in figure 3 are fi
= 0.45 and fe = 0.01. The
scale of election irregularities can be visualized with the cumulative
number of votes as a function of the turnout (figure 4). For each turnout
level, the total number of votes from districts with this or a lower turnout
is shown. Each curve corresponds to the respective election winner in a different
country. Normally these cumulative distributions level off and form a
plateau from the party’s maximal vote count on. Again this is not the case
for Russia and Uganda. Both show a boost phase of increased extreme fraud toward
the right end of the distribution (red circles). Russia never even shows a
tendency to form a plateau. It is
imperative to emphasize that the shape of the fingerprints in figure 1 will deviate
from pure 2-d Gaussian distributions due to non-fraudulent mechanisms, such
as heterogeneities in the population or voter mobilization, see SOM. However,
these can under no circumstances explain the mode of extreme fraud. A bad
forgery is the ultimate insult¹. It can be
said with near certainty that an election does not represent the will of
the people, if a substantial fraction (fe) of
districts reports a 100% turnout with almost all votes for a single party,
and/or if any significant deviations from the sigmoid form in the cumulative
distribution of votes versus turnout are observed. Another indicator of
systematic fraudulent or irregular voting behaviour is a kurtosis of the
logarithmic vote rate distribution of the order of several hundreds. Should
such signals be detected, it is tempting to invoke G.B. Shaw, who held that “democracy
is a form of government that substitutes election by the incompetent many for
appointment by the corrupt few.”

FIG. 1.
Election fingerprints: 2-d histograms of the number of electoral districts
for a given voter turnout (x-axis) and the percentage of votes (y-axis) for
the winning party (or candidate) in recent elections from eight different
countries (from left to right, top to bottom: Austria, Finland, Russia, Spain,
Switzerland, Uganda, UK and USA) are shown. Colour represents the number of
electoral districts. Districts usually cluster around a given turnout and
voting level. In Uganda and Russia these clusters are ‘smeared out’ to the
upper right region of the plots, reaching a second peak at 100% turnout and
100% of votes (red circles). In Finland the main cluster is smeared out
into two directions (indicative of voter mobilization due to controversies
surrounding the True Finns). In the UK the fingerprint shows two clusters
stemming from rural and urban areas (see SOM).

FIG. 3.
Comparison of observed and modelled 2-d histograms for (top to bottom)
Russia, Uganda and Switzerland. The left column shows the actual election fingerprints,
the middle column shows a fit with the fraud
model. The column to the right shows the expected model outcome of fair
elections (i.e. absence of fraudulent mechanisms fi
= fe = 0). For Switzerland the fair and fitted
models are almost the same. The
results for Russia and Uganda can only be explained by the model assuming a
large number of fraudulent districts.

FIG. 4.
The ballot stuffing mechanism can be visualized by considering the cumulative
number of votes as a function of turnout. Each country’s election winner is
represented by a curve which typically takes the shape of a sigmoid function reaching
a plateau. In contrast to the other countries, Russia and Uganda do not tend
to develop this plateau but instead show a pronounced increase (boost) close
to complete turnout. Both
irregularities are indicative of the two ballot stuffing modes being present.

SUPPORTING ONLINE MATERIAL

The data

Descriptive
statistics and official sources of the election results are shown in table
SI. The raw data will be made available for download at http://www.complex-systems.meduniwien.ac.at/.
Each data set reports election results of parliamentary (Austria,
Finland, Russia, Spain, Switzerland and UK) or presidential (Uganda, USA)
elections on district level. In the rare circumstances where electoral
districts report more valid ballots than registered voters, we work with a
turnout of 100%. With the exception of the US data, each country reports the
number of registered voters and valid ballots for each party and district.
For the US there are no exact data on the voting-eligible population at the
district level; it was estimated to be the same as the population above 18
years of age, available at http://census.gov. Fingerprints for the 2000 US presidential
elections are shown in figure S1 for both candidates for districts from the
entire USA and Florida only. There are no irregularities discernible.

Model

A country
is separated into n electoral districts i, each having
an electorate of Ni people and in total Vi valid votes. The fraction of valid
votes for the winning party in district i is
denoted $v_i$. The average turnout over all districts, $\bar{a}$, is given by $\bar{a} =
\frac{1}{n}\sum_i (V_i/N_i)$ with standard deviation $s_a$; the mean
fraction of votes $\bar{v}$ for the winning party is $\bar{v} = \frac{1}{n}\sum_i v_i$ with standard
deviation $s_v$. The mean values $\bar{a}$ and $\bar{v}$ are
typically close to but not identical to the values which maximize the
empirical distribution function of turnout and votes over all districts. Let
v be the number of votes where the empirical distribution function assumes
its (first local) maximum (rounded to entire percents), see figure S2.
Similarly a is the turnout where the empirical distribution function of turnouts
ai takes its (first local) maximum. The distributions
for turnout and votes are extremely skewed to the right for Uganda and Russia
which also inflates the standard deviations in these countries, see table SI.
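The descriptive quantities defined above can be computed along the following lines. This is a minimal sketch with hypothetical function names. Two simplifications are assumptions rather than the authors' procedure: the sample standard deviation uses `ddof=1`, and the "(first local) maximum" is approximated by the global maximum of the distribution binned to whole percent, which coincides with it only when the main peak dominates.

```python
import numpy as np

def mode_percent(fractions):
    """Most frequent value after rounding to whole percent (0..100)."""
    pct = np.rint(100.0 * np.asarray(fractions, dtype=float)).astype(int)
    counts = np.bincount(np.clip(pct, 0, 100), minlength=101)
    return int(np.argmax(counts))  # ties resolve to the lowest percentage

def describe(valid_votes, votes_winner, electorate):
    """Mean, standard deviation and mode of turnout and winner vote share."""
    V = np.asarray(valid_votes, dtype=float)
    W = np.asarray(votes_winner, dtype=float)
    N = np.asarray(electorate, dtype=float)
    a = np.minimum(V / N, 1.0)  # turnout a_i, capped at 100%
    v = W / V                   # vote share v_i of the winning party
    return {
        "mean_turnout": float(a.mean()),
        "sd_turnout": float(a.std(ddof=1)),
        "mean_share": float(v.mean()),
        "sd_share": float(v.std(ddof=1)),
        "mode_turnout_percent": mode_percent(a),
        "mode_share_percent": mode_percent(v),
    }
```

Comparing the means with the modes gives a quick indication of how strongly the distributions are skewed to the right.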
To account for this a ‘left-sided’ (‘right-sided’) mean deviation $\sigma_v^L$ ($\sigma_v^R$) from $v$ is
introduced. $\sigma_v^R$ can be regarded as the
incremental fraud width, a measurable parameter quantifying how intense the
vote stuffing is. This contributes to the ‘smearing out’ of the main peaks in
the election fingerprints, see figure
$\sigma_v^R = \sqrt{\langle (v_i - v)^2 \rangle_{v_i > v}}$ . (2) Similarly
the extreme fraud width σx can be estimated, i.e.
the width of the peak around 100% votes. We found that σx
= 0.075 describes all encountered vote distribution

FIG. 1.
Turnout against percentage of votes for Bush (left column) and Gore (right)
in the 2000 US presidential elections. Results are shown for all districts in
the USA (top row) and for districts from Florida (bottom). There are no
traces of fraudulent mechanisms discernible in the fingerprints. FIG. |