November 28, 2024

Sarasota: Limited Investigations

As I wrote last week, malfunctioning voting machines are one of the two plausible theories that could explain the mysterious undervotes in Sarasota’s congressional race. To get a better idea of whether malfunctions could be the culprit, we would have to investigate – to inspect the machines and their software for any relevant errors in design or operation. A well-functioning electoral system ought to be able to do such investigations in an open and thorough manner.

Two attempts have been made to investigate. The first was by representatives of Christine Jennings (the officially losing candidate) and a group of voters, who filed lawsuits challenging the election results and asked, as part of the suits’ discovery process, for access by their experts to the machines and their code. The judge denied their request, in a curious order that seemed to imply that they would first have to prove that there was probably a malfunction before they could be granted access to the evidence needed to tell whether there was a malfunction.

The second attempt was by the Department of State (DOS) of the state of Florida, which commissioned a study by outside experts. Oddly, I am listed in the official Statement of Work (SOW) as a principal investigator on the study team, even though I am not a member of the team. Many people have asked how this happened. The short answer is that I discussed with representatives of DOS the possibility of participating, but eventually it became clear that the study they wanted to commission was far from the complete, independent study I had initially thought they wanted.

The biggest limitation on the study is that DOS is withholding information and resources needed for a complete study. Most notably, they are not providing access to voting machines. You don’t have to be a rocket scientist to realize that if you want to understand the behavior of voting machines, it helps to have a voting machine to examine. DOS could have provided or facilitated access to a machine, but it apparently chose not to do so. [Correction (Feb. 28): The team’s final report revealed that DOS had changed its mind and given the team access to voting machines.] The Statement of Work is clear that the study is to be “a … static software analysis on the iVotronics version 8.0.1.2 firmware source code”.

(In computer science, “static” analysis of software refers to methods that examine the program’s code and structure without running it; “dynamic” methods observe and measure the software while it is running.)
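To make the distinction concrete, here is a toy Python sketch of my own (it has nothing to do with the actual iVotronic firmware, which is not written in Python): the static step inspects the program’s structure without executing it, while the dynamic step runs the program and observes its behavior.

# Toy contrast between static and dynamic analysis. Purely illustrative.
import ast
import time

SOURCE = """
def record_vote(votes, race, choice):
    votes[race] = choice
    return votes
"""

# Static analysis: examine the code's text and structure without running it.
tree = ast.parse(SOURCE)
assignments = [node for node in ast.walk(tree) if isinstance(node, ast.Assign)]
print(f"static: found {len(assignments)} assignment statement(s)")

# Dynamic analysis: execute the code and observe what it actually does.
namespace = {}
exec(SOURCE, namespace)
start = time.perf_counter()
result = namespace["record_vote"]({}, "congress", "candidate A")
elapsed = time.perf_counter() - start
print(f"dynamic: record_vote returned {result} in {elapsed:.6f} seconds")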

The good news is that the team doing the study is very strong technically, so there is some hope of a useful result despite the limited scope of the inquiry. There have been some accusations of political bias against team members, but knowing several members of the team I am confident that these charges are misguided and the team won’t be swayed by partisan politics. The limits on the study aren’t coming from the team itself.

The results of the DOS-sponsored study should be published sometime in the next few months.

What we have not seen, and probably won’t, is a full, independent study of the iVotronic machines. The voters of Sarasota County – and everyone who votes on paperless machines – are entitled to a comprehensive study of what happened. Sadly, it looks like lawyers and politics will stop that from happening.

Why So Many Undervotes in Sarasota?

The big e-voting story from November’s election was in Sarasota, Florida, where a congressional race was decided by about 400 votes, with 18,412 undervotes. That’s 18,412 voters who cast votes in other races but not, according to the official results, in that congressional race. Among voters who used the ES&S iVotronic machines – that is, non-absentee voters in Sarasota County – the undervote rate was about 14%. Something went very wrong. But what?

Since the election there have been many press releases, op-eds, and blog posts about the undervotes, not to mention some lawsuits and scholarly studies. I want to spend the rest of the week dissecting the Sarasota situation, which I have been following closely. I’m doing this now for two reasons: (1) enough time has passed for the dust to settle a bit, and (2) I’m giving a joint talk on the topic next week and I want to work through some thoughts.

There’s no doubt that something about the iVotronic caused the undervotes. Undervote rates differed so starkly in the same race between iVotronic and non-iVotronic voters that the machines must be involved somehow. (For example, absentee voters had a 2.5% undervote rate in the congressional race, compared to 14% for iVotronic voters.) Several explanations have been proposed, but only two are at all plausible: ballot design and machine malfunction.

The ballot design theory says that the ballot offered to voters on the iVotronic’s screen was misdesigned in a way that caused many voters to miss that race. Looking at screenshots of the ballot, one can see how voters might miss the congressional race at the top of the second page. (Depressingly, some sites show a misleading photo that the photographer angled and lit to make the misdesign look worse than it really was.) It’s very plausible that this kind of problem caused some undervotes; and that is consistent with the reports of many voters that the machine did not show them the congressional race.

It’s one thing to say that ballot design could have caused some undervotes, but it’s another thing entirely to say it was the sole cause of so elevated an undervote rate. Each voter, before finalizing his vote, was shown a clearly designed confirmation screen listing his choices and clearly showing a no-candidate-selected message for the congressional race. Did so many voters miss that too? And what about the many voters who reported choosing a candidate in the congressional race, only to have the no-candidate-selected message show up on the confirmation screen anyway?

The malfunction theory postulates a problem or malfunction with the voting machines that caused votes not to be recorded. There are many types of problems that could have caused lost votes. The best way to evaluate the malfunction theory is to conduct a careful and thorough study of the machines themselves. In the next entry I’ll talk about the efforts that have been made toward that end. For now, suffice it to say that no suitable study is available to us.

If we had a voter-verified paper trail, we could immediately tell which theory is correct, by comparing the paper and electronic records. If the voter-verified paper records show the same high undervote rate, then the ballot design theory is right. If the paper and electronic records show significantly different undervote rates, then something is wrong with the machines. But of course the advocates of paperless voting argued that paper trails were unnecessary – while also arguing that touchscreen systems reduce undervotes.

Several studies have tried to use statistical analyses of undervote patterns in different races, precincts, and machines to evaluate the two theories. Frisina, Herron, Honaker, and Lewis say the data support the ballot design theory; Mebane and Dill say the data point to malfunction as a likely cause of at least some of the undervotes. Reading these studies, I can’t reach a clear conclusion.

What would convince me, one way or the other, is a good study of the machines. I’ll talk next time about the fight over whether and how to look at the machines.

Record Companies Boxed In By Their Own Rhetoric

Reports are popping up all over that the major record companies are cautiously gearing up to sell music in MP3 format, without any DRM (anti-copying) technology. This was the buzz at the recent Midem conference, according to a New York Times story.

The record industry has worked for years to frame the DRM issue, with considerable success. Mainstream thinking about DRM is now so mired in the industry’s framing that the industry itself will have a hard time explaining and justifying its new course.

The Times story is a perfect example. The headline is “Record Labels Contemplate Unrestricted Music”, and the article begins like this:

As even digital music revenue growth falters because of rampant file-sharing by consumers, the major record labels are moving closer to releasing music on the Internet with no copying restrictions — a step they once vowed never to take.

Executives of several technology companies meeting here at Midem, the annual global trade fair for the music industry, said over the weekend that at least one of the four major record companies could move toward the sale of unrestricted digital files in the MP3 format within months.

But of course the industry won’t sell music “with no copying restrictions” or “unrestricted”. The mother of all copying restrictions – copyright law – will still apply and will still restrict what people can do with the music files. I can understand leaving out a qualifier in the headline, where space is short. But in a 500-word article, surely a few words could have been spared for this basic point.

Why did the Times (and many commentators) mistake MP3 for “unrestricted”? Because the industry has created a conventional wisdom that (1) MP3 = lawless copying, (2) copyright is a dead letter unless backed by DRM, and (3) DRM successfully reduces copying. If you believe these things, then the fact that copyright still applies to MP3s is not even worth mentioning.

The industry will find these views particularly inconvenient when it is ready to sell MP3s. Having long argued that customers can’t be trusted with MP3s, the industry will have to ask the same customers to use MP3s responsibly. Having argued that DRM is necessary to its business – to the point of asking Congress for DRM mandates – it will now have to ask artists and investors to accept DRM-free sales.

All of this will make the industry’s wrong turn toward DRM look even worse than it already does. Had the industry embraced the Internet early and added MP3 sales to its already DRM-free CDA (Compact Disc Audio format) sales, it would not have reached this sad point. Now it has to overcome history, its own pride, and years of its own rhetoric.

Wikipedia Leads; Will Search Engines NoFollow?

Wikipedia has announced that all of its outgoing hyperlinks will now include the rel="nofollow" attribute, which instructs search engines to disregard the links. Search engines infer a page’s importance by seeing who links to it – pages that get many links, especially from important sites, are deemed important and are ranked highly in search results. A link is an implied endorsement: “link love”. Adding nofollow withholds Wikipedia’s link love – and Wikipedia, being a popular site, has lots of link love to give.
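To make the mechanics concrete, here is a rough sketch of my own (real search engines are of course far more elaborate) of the crawler-side step: extract a page’s outgoing links, but drop the ones marked rel="nofollow" before they feed into any ranking computation.

# Sketch of a crawler honoring rel="nofollow" when building the link graph
# that a ranking algorithm would consume. Illustrative only.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.followed_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            return  # withhold link love: this link won't count toward ranking
        if "href" in attrs:
            self.followed_links.append(attrs["href"])

page = '''
<p>See <a href="https://example.org/article">a cited source</a> and
<a rel="nofollow" href="https://spam.example/pills">a spammy link</a>.</p>
'''

extractor = LinkExtractor()
extractor.feed(page)
print(extractor.followed_links)   # ['https://example.org/article']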

Nofollow is intended as an anti-spam measure. Anybody can edit a Wikipedia page, so spammers can and do insert links to their unwanted sites, thereby leeching off the popularity of Wikipedia. Nofollow will reduce spammers’ incentives by depriving them of any link love. Or that’s the theory, at least. Bloggers tried using nofollow to attack comment spam, but it didn’t reduce spam: the spammers were still eager to put their spammy text in front of readers.

Is nofollow a good idea for Wikipedia? It depends on your general attitude toward Wikipedia. The effect of nofollow is to reduce Wikipedia’s influence on search engine rankings (to zero). If you think Wikipedia is mostly good, then you want it to have influence and you’ll dislike its use of nofollow. If you think Wikipedia is unreliable and random, then you’ll be happy to see its influence reduced.

As with regular love, it’s selfish to withhold link love. Sometimes Wikipedia links to a site that competes with it for attention. Without Wikipedia’s link love, the other site will rank lower, and it could lose traffic to Wikipedia. Whether intended or not, this is one effect of Wikipedia’s action.

There are things Wikipedia could do to restore some of its legitimate link love without helping spammers. It could add nofollow only to links that are suspect – links that are new, or were added by a user without a solid track record on the site, or that haven’t yet survived several rewrites of a page, or some combination of such factors. Even a simple policy of using nofollow for the first two weeks might work well enough. Wikipedia has the data to make these kinds of distinctions, and it’s not too much to ask for a site of its importance to do the necessary programming.
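Here is a sketch of the kind of selective policy described above; the field names and thresholds are invented for illustration, and Wikipedia’s actual software works differently.

# Hypothetical policy: apply nofollow only to "suspect" external links.
# Field names and thresholds are made up for illustration.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ExternalLink:
    url: str
    added_at: datetime          # when the link was inserted
    editor_edit_count: int      # rough proxy for the editor's track record
    revisions_survived: int     # how many later rewrites kept the link

def needs_nofollow(link: ExternalLink, now: datetime) -> bool:
    """Return True if the link should carry rel="nofollow"."""
    too_new = now - link.added_at < timedelta(days=14)
    untrusted_editor = link.editor_edit_count < 50
    unvetted = link.revisions_survived < 3
    return too_new or (untrusted_editor and unvetted)

link = ExternalLink(
    url="https://example.org/reference",
    added_at=datetime(2007, 1, 1),
    editor_edit_count=200,
    revisions_survived=5,
)
print(needs_nofollow(link, now=datetime(2007, 2, 1)))   # False: an established link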

But the one element missing so far in this discussion is the autonomy of the search engines. Wikipedia is asking search engines not to assign link love, but the search engines don’t have to obey. Wikipedia is big enough, and quirky enough, that the search engines’ ranking algorithms probably have Wikipedia-specific tweaks already. The search engines have surely studied whether Wikipedia’s link love is reliable enough – and if it’s not, they are surely compensating, perhaps by ignoring (or reducing the weight of) Wikipedia links, or perhaps by a rule such as ignoring links for the first few weeks.

Whether or not Wikipedia uses nofollow, the search engines are free to do whatever they think will optimize their page ranking accuracy. Wikipedia can lead, but the search engines won’t necessarily nofollow.

AACS: Modeling the Battle

[Posts in this series: 1, 2, 3, 4, 5, 6, 7.]

By this point in our series on AACS (the encryption scheme used in HD-DVD and Blu-ray) it should be clear that AACS creates a nontrivial strategic game between the AACS central authority (representing the movie studios) and the attackers who want to defeat AACS. Today I want to sketch a model of this game and talk about who is likely to win.

First, let’s talk about what each party is trying to achieve. The central authority wants to maximize movie studio revenue. More precisely, they’re concerned with the portion of revenue that is due to AACS protection. We’ll call this the Marginal Value of Protection (MVP): the revenue they would get if AACS were impossible to defeat, minus the revenue they would get if AACS had no effect at all. The authority’s goal is to maximize the fraction of MVP that the studios can capture.

In practice, MVP might be negative. AACS makes a disc less useful to honest consumers, thereby reducing consumer demand for discs, which hurts studio revenue. (For example: Alex and I can’t play our own HD-DVD discs on our computers, because the AACS rules don’t like our computers’ video cards. The only way for us to watch these discs on our equipment would be to defeat AACS. (Being researchers, we want to analyze the discs rather than watch them, but normal people would insist on watching.)) If this revenue reduction outweighs any revenue increase due to frustrating infringement, MVP will be negative. But of course if MVP is negative then a rational studio will release its discs without AACS encryption; so we will assume for analytic purposes that MVP is positive.

We’ll assume there is a single attacker, or equivalently that multiple attackers coordinate their actions. The attacker’s motive is tricky to model but we’ll assume for now that the attacker is directly opposed to the authority, so the attacker wants to minimize the fraction of MVP that the studios can capture.

We’ll assume the studios release discs at a constant rate, and that the MVP from a disc is highest when the disc is first released and then declines exponentially, with time constant L. (That is, MVP for a disc is proportional to exp(-(t-t0)/L), where t0 is the disc’s release date.) Most of the MVP from a disc will be generated in the first L days after its release.

We’ll assume that the attacker can compromise a new player device every C days on average. We’ll model this as a Poisson process, so that the likelihood of compromising a new device is the same every day, or equivalently the time between compromises is exponentially distributed with mean C.

Whenever the attacker has a compromised device, he has the option of using that device to nullify the MVP from any set of existing discs. (He does this by ripping and redistributing the discs’ content or the keys needed to decrypt that content.) But once the attacker uses a compromised device this way, the authority gets the ability to blacklist that compromised device so that the attacker cannot use it to nullify MVP from any future discs.

Okay, we’ve written down the rules of the game. The next step – I’ll spare you the gory details – is to translate the rules into equations and solve the equations to find the optimal strategy for each side and the outcome of the game, that is, the fraction of MVP the studios will get, assuming both sides play optimally. The result will depend on two parameters: L, the commercial lifetime of a disc, and C, the time between player compromises.

It turns out that the attacker’s best strategy is to withhold any newly discovered compromise until a “release window” of size R has passed since the last time the authority blacklisted a player. (R depends in a complicated way on L and C.) Once the release window has passed, the attacker will use the compromise aggressively and the authority will then blacklist the compromised player, which essentially starts the game over. The studio collects revenue during the release window, and sometimes beyond the release window when the attacker gets unlucky and takes a long time to find another compromise.

The fraction of MVP collected by the studio turns out to be approximately C/(C+L). When C is much smaller than L, the studio loses most of the MVP, because the attacker compromises players so frequently that he can nullify a disc’s MVP early in its commercial lifetime. But when C is much bigger than L, a disc will be able to collect most of its MVP before the attacker can find a compromise.
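A quick Monte Carlo check supports that figure. The sketch below is my own back-of-the-envelope version of the model, simplified so that the attacker uses each compromise as soon as he finds it (a release window of zero); the parameter values are made up.

# Monte Carlo check of the approximation: fraction of MVP collected ≈ C/(C+L).
# Simplified model: compromises arrive with exponential gaps of mean C and are
# used immediately, so a disc waits an exponentially distributed time (mean C)
# before its remaining MVP, decaying with time constant L, is nullified.
import math
import random

def simulated_fraction(C, L, trials=200_000):
    collected = 0.0
    for _ in range(trials):
        wait = random.expovariate(1.0 / C)        # days until the next compromise
        collected += 1.0 - math.exp(-wait / L)    # share of MVP earned before then
    return collected / trials

for C, L in [(30, 180), (90, 180), (360, 180)]:   # made-up values, in days
    print(f"C={C:>3}, L={L}: simulated {simulated_fraction(C, L):.3f}, "
          f"C/(C+L) = {C / (C + L):.3f}")

With those made-up numbers, for example, C = 30 days and L = 180 days leaves the studios only about one-seventh of the MVP.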

To predict the game’s outcome, then, we need to know the ratio of C (the time needed to compromise a player) to L (the commercial lifetime of a disc). Unfortunately we don’t have good data to estimate C and L. My guess, though, is that C will be considerably less than L in the long run. I’d expect C to be measured in weeks and L in months. If that’s right, it’s bad news for AACS.