Voter Personal Data Out in the Wild

I just saw an article on CBC about a “privacy breach” at Elections Ontario.  A set of “memory sticks” has gone walkabout, taking with it the names, addresses, birth dates, and genders of registered voters in 24 provincial ridings.  Oh dear.

The Chief Electoral Officer for the province says that the information was encrypted and that there is no evidence that the data were accessed.

Hold on.  If the memory sticks are missing, and therefore not available for inspection, then you have no information whatsoever on the state of their contents.  It is perfectly true that there is no evidence that the data were accessed… because you can’t check them to find out.

I could just as easily tell you that my autographed photo of Eddie Murphy is missing, and that there’s no evidence that someone else has crossed out my name and written “Bono” in permanent marker.  Indeed, I have no evidence of the portrait’s Bonofication, but that absence of evidence is not evidence of absence.  Damn you, Bono.

It would be nice if the Chief Electoral Officer gave out details of the encryption used on the memory sticks.  Without knowing the method, the term “encryption” could mean anything from a real symmetric-key cipher (AES, Twofish, etc.) to simply giving the files inconspicuous names (not_voter_data.dat).

There was also mention that the data may indicate whether the individual voter actually cast a ballot in the last election.  I have no issue with them tracking participation, but that data should not be stored alongside information that could identify individuals.  Information that uniquely identifies individual people must be kept segregated.  There should be no master database that contains all data points, big and small.

At a minimum, sensitive data should be stored in one database, non-sensitive data in another.  Each person would have a unique, randomly assigned identifier (a number, for example), and that would be the only link between the two databases.  If the voter registry were leaked, you would know who the voters were, but not whether they voted.  If the record of voter participation were leaked, you would know which identifiers belonged to someone who voted, but have no information about who that person was.

Obviously there will be occasions when you need some information from both databases (accomplished with a JOIN, in SQL), but the resulting mix of personal and non-personal data would be temporary, not something that you would store on a USB stick.
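To make the split concrete, here is a rough sketch in Python using SQLite.  Everything in it is my own invention for illustration: the table names (registry, turnout), the column names, the helper functions, and the sample voters.  I have no idea what Elections Ontario actually runs.  Two in-memory databases stand in for two separately stored files, and a random token from the secrets module plays the part of the unique identifier.

```python
import secrets
import sqlite3


def build_split_store(voters):
    """Load (name, address, voted) tuples into two separate databases.

    Hypothetical layout: 'main' holds identifying data, 'participation'
    holds only the random identifier and a voted flag.
    """
    conn = sqlite3.connect(":memory:")
    # A second, independent in-memory database. In real life this would be
    # a different file, ideally on different storage with different access.
    conn.execute("ATTACH DATABASE ':memory:' AS participation")
    conn.execute(
        "CREATE TABLE main.registry "
        "(voter_id TEXT PRIMARY KEY, name TEXT, address TEXT)"
    )
    conn.execute(
        "CREATE TABLE participation.turnout "
        "(voter_id TEXT PRIMARY KEY, voted INTEGER)"
    )
    for name, address, voted in voters:
        # Random identifier: reveals nothing about the person it belongs to.
        voter_id = secrets.token_hex(16)
        conn.execute(
            "INSERT INTO main.registry VALUES (?, ?, ?)",
            (voter_id, name, address),
        )
        conn.execute(
            "INSERT INTO participation.turnout VALUES (?, ?)",
            (voter_id, int(voted)),
        )
    conn.commit()
    return conn


def names_of_those_who_voted(conn):
    """Temporary JOIN across both databases; the result is never stored."""
    rows = conn.execute(
        "SELECT r.name FROM main.registry AS r "
        "JOIN participation.turnout AS t ON t.voter_id = r.voter_id "
        "WHERE t.voted = 1 ORDER BY r.name"
    )
    return [name for (name,) in rows]


# Made-up sample data.
conn = build_split_store([
    ("Alice Example", "1 Main St", True),
    ("Bob Example", "2 Main St", False),
])
print(names_of_those_who_voted(conn))  # ['Alice Example']
```

The turnout table on its own is just a list of random tokens and flags, and the registry on its own says nothing about who voted.  Only the query brings the two together, and only for as long as the result is in memory.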

At least, that’s how it should be done.  Let’s see what the morning briefing reveals…