[E-voting] Not random enough?

Adrian Colley aecolley at spamcop.net
Mon Mar 15 15:13:40 GMT 2004

On Mon, Mar 15, 2004 at 12:43:26PM -0000, Aengus Lawlor wrote:
> Adrian Colley <aecolley at spamcop.net> wrote:
>> A new thing is that I looked at the PTB's evaluation of the random
>> placement of ballot records within the ballot module.  It _isn't_
>> random enough.  If you know the preferences of voters #1 and #3, and
>> they're unique, then you can deduce voter #2's preferences with 100%
>> accuracy just from examining the order of ballot records on the
>> ballot module.  It doesn't matter how many ballots are stored on the
>> ballot module.  Apparently 80% of all ballots are unique.  I think
>> this is a real honest-to-God showstopper for the ESI2.

> Can you expand on that a bit, Adrian?

Sure.  On pages 40 and 41 of the PTB "Test report 2 - Voting machine
ESI2 - Software for elections in Ireland"
it says:

ptb> 8.8-1   The votes should be stored randomly in the ballot module.

ptb> MB: The votes are stored in the vote memory of the ballot module
ptb> using a timer-based method.  For the first vote a place is selected
ptb> randomly somewhere in the middle of the vote memory.  The following
ptb> votes are stored in front of or after the votes stored previously.
ptb> The decision between beginning and end is also made at random.

ptb> The timer used to generate the random place and the random decision
ptb> is initialised during start-up tests and then increased every 8
ptb> msec.  When the first vote is stored, the timer is read out to select
ptb> the place.  The timer can be thought of as a "pointer" pointing to
ptb> the middle part of the vote memory.  It points to the beginning of
ptb> the middle part when the machine is started and advances to the
ptb> next memory place every 8 msec.  It reaches the end after 16 sec
ptb> and starts again at the beginning.

ptb> To identify the first vote, the time difference between start and
ptb> readout of the timer must be known with an uncertainty of a few
ptb> msec.  The start-of-timer point is not equal to the switch-on of
ptb> the voting machine, and the readout point is not equal to the
ptb> pressing of the CAST VOTE(S) button.  Both points are not detectable
ptb> for an observer.

ptb> Knowledge of the storing method is no help in identifying the first
ptb> vote.

ptb> Since the first vote is unknown, all votes cannot be associated to
ptb> certain voters.

ptb> Summary: The requirement is fulfilled.  The votes are randomly stored.
ptb> They are stored so that nobody will be able to assign certain votes
ptb> to certain voters.  Knowledge of the storing method does not help
ptb> in identifying certain votes.

This means that the first ballot record is stored (pseudo)randomly, but
each subsequent ballot record is added to either the start or the end of
the existing block, so that all the ballot records form a contiguous
block in the module memory.

The PTB concludes that because a zero-knowledge outsider cannot tell
which ballot was cast first, that the start of the sequence can't be
found.  As mathematicians will recognise, the ballots are written in a
"random walk", which will generally centre itself around the origin.  So
there's a very high probability that the first ballot is near the
centre.  Of course, the first voter knows his/her ballot and can pick it
out easily.

The PTB goes on to suppose that no other ballots can be picked out
because the first one is unfindable.  It's dodgy reasoning.  Each ballot
record is written in one of only two places: start or end.  If you know
the values of a few of the votes and the position of those voters in
sequence, then you can use Bayesian inference to compute the probability
that a postulated order of voters is correct.  Where known votes are
close together, the probability is high.

It reaches a trivial limit (100% certainty) near the first vote.  There
are only four ways for the first three ballot records to be placed:

A:    R1 R2 R3
B:    R3 R1 R2
C:    R2 R1 R3
D:    R3 R2 R1

If you know the value of R1 and R3 (and voters #1 and #3 can collaborate
to discover this), and you can locate them in the ballot module, then
you can always recover the value of R2 by knowledge of its position.  In
the case of A and D, R2 must lie between R1 and R3; in the case of B or
C, R1 and R3 are adjacent, and R2 is the immediate neighbour of R1.

I said in my previous mail that 80% of ballots are unique, which has
since been shown to be an overestimate.  In fact, R1 and R3 don't have
to be random; there just has to be a single match when you search for R1
and R3 within 3 adjacent blocks.

So voters #1 and #3 can deterministically discover what voter #2's cast
preferences were, by inspection of the ballot module.  This is quite a
departure from the previous system.  Other voters are at risk of having
their immediately preceding/following voters discover their preferences
in the same way, but there's almost a 50% chance that it won't be clear
(and slightly more than a 50% chance that they will be certain).  It all
depends on the selection of start or finish for each ballot record.

There are two possible attitudes to this: (1) tough for voter #2, other
voters have a probability that their vote won't be disclosed; (2) it's
unconstitutional endangerment of secrecy for all voters, and
overwhelmingly so in the case of voter #2.  Guess which camp I'm in?

Oh yes, voters #2 and #3 can similarly collaborate to discover the value
of R1.


GPG 0x43D3AD19 17D2 CA6E A18E 1177 A361  C14C 29DB BA4B 43D3 AD19

More information about the E-voting mailing list