January 13, 2025

The (Ironic) Best Way to Make the Bailout Transparent

The next piece of proposed bailout legislation is called the American Recovery and Reinvestment Act of 2009. Chris Soghoian, who is covering the issue on his Surveillance State blog at CNET, brought the bill to my attention, particularly a provision requiring that a new web site called Recovery.gov “provide data on relevant economic, financial, grant, and contract information in user-friendly visual presentations to enhance public awareness of the use funds made available in this Act.” As a group of colleagues and I suggested last year in Government Data and the Invisible Hand, there’s an easy way to make rules like this one a great deal more effective.

Ultimately, we all want information about bailout spending to be available in the most user-friendly way to the broadest range of citizens. But is a government monopoly on “presentations” of the data the best way to achieve that goal? Probably not. If Congress orders the federal bureaucracy to provide a web site for end users, then we will all have to live with the one web site they cook up. Regular citizens would have more and better options for learning about the bailout if Congress told the executive branch to provide the relevant data in a structured machine-readable format such as XML, so that many independent sites can be built to analyze the data. (A government site aimed at end users would also be fine. But we’re only apt to get machine-readable data if Congress makes it a requirement.)

Why does this matter? Because without the underlying data, anyone who wants to provide a useful new tool for analysis must first try to reconstruct the underlying numbers from the “user-friendly visual presentations” or “printable reports” that the government publishes. Imagine trying to convert a nice-looking graph back into a list of figures, or trying to turn a printed transcript of a congressional debate into a searchable database of who said what and when. It’s not easy.
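To make the contrast concrete, here is a minimal sketch of how little work a third-party tool needs once structured data exists. Everything about the data format is an assumption for illustration: the `award` element and its `agency`, `state`, and `amount` attributes are invented, since no actual Recovery.gov schema is described here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data. The element and attribute names are invented
# for illustration and do not reflect any real government schema.
SAMPLE_XML = """
<awards>
  <award agency="Department of Transportation" state="NJ" amount="1500000.00"/>
  <award agency="Department of Energy" state="CA" amount="2750000.50"/>
  <award agency="Department of Transportation" state="CA" amount="500000.00"/>
</awards>
"""


def total_by(xml_text, attribute):
    """Sum award amounts, grouped by the given attribute (e.g. "agency")."""
    root = ET.fromstring(xml_text.strip())
    totals = defaultdict(float)
    for award in root.iter("award"):
        totals[award.get(attribute)] += float(award.get("amount"))
    return dict(totals)


if __name__ == "__main__":
    print(total_by(SAMPLE_XML, "agency"))
    print(total_by(SAMPLE_XML, "state"))
```

The same few lines can regroup the figures by state, by agency, or by any other field the data carries. Recovering those numbers from a chart or a printable report would mean re-keying them by hand.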

Once the computer-readable data is out there—whether straightforwardly published by the government officials who have it in the first place, or painstakingly recreated by volunteers who don’t—we know that a small army of volunteers and nonprofits stands ready to create tools that regular citizens, even those with no technical background at all, will find useful. This group of volunteers is itself a small constituency, but the things they make, like Govtrack, Open Congress, and Washington Watch, are used by a much broader population of interested citizens. The federal government might decide to put together a system for making maps or graphs. But what about an interactive one like this? What about three-dimensional animated visualizations over time? What about an interface that’s specially designed for blind users, who still want to organize and analyze the data but may be unable to benefit as most of us can from visualizations? There might be an interface in Spanish, the second most common American language, but what about one in Tagalog, the sixth most common?

There’s a deep and important irony here: The best way for government data to reach the broadest possible population is probably to release it in a form that nobody wants to read. XML files are called “machine-readable” because they make sense to a computer, rather than to human eyes. Releasing the data that way—so a variety of “user-friendly presentations,” to match the variety of possible users, can emerge—is what will give regular citizens the greatest power to understand and react to the bailout. It would be a travesty to make government the only source for interaction with bailout data—the transparency equivalent of central planning. It would be better for everyone, and easier, to let a thousand mashups bloom.

CA SoS Bowen sends proposals to EAC

California Secretary of State Debra Bowen has sent a letter to Chair Gineen Beach of the US Election Assistance Commission (EAC) outlining three proposals that she thinks will markedly improve the integrity of voting systems in the country.

I’ve put a copy of Bowen’s letter here (87kB PDF).

Bowen’s three proposals are:

  • Vulnerability Reporting — The EAC should require that vendors disclose vulnerabilities, flaws, problems, etc. to the EAC as the system certification authority and to all the state election directors that use the affected equipment.
  • Uniform Incident Reporting — The EAC should create and adopt procedures that jurisdictions can follow to collect and report data about incidents they experience with their voting systems.
  • Voting System Performance Measurement — As part of the Election Day Survey, the EAC should systematically collect data from election officials about how voting systems perform during general elections.

In my opinion, each of these would be a welcome move for the EAC.

These proposals would put into place a number of essential missing elements of administering computerized election equipment. First, for the users of these systems, election officials, it can be extremely frustrating and debilitating if they suspect that some voting system flaw is responsible for problems they’re experiencing. Often, when errors arise, contingency planning requires detailed knowledge of the specific voting system flaw. Without knowing as much as possible about the problem they’re facing, election officials can exacerbate the problem. At best, not knowing about a potential flaw can do what Bowen describes: doom the election official, and others with the same equipment, to repeatedly encounter the flaw in subsequent elections. Of course, vendors are the most likely to have useful information on a given flaw, and they should be required to report this information to both the EAC and election officials.

Often most of the information we have about voting system incidents comes from reports by local journalists. These reporters don’t tend to cover high technology too often; their reports are often incomplete and in many cases simply and obviously incorrect. Having a standardized set of elements that an election official can collect and report about voting system incidents will help to ensure that the data comes directly from those experiencing a given problem. The EAC should design such procedures and then a system for collecting and reporting these issues to other election officials and the public.
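As a rough illustration of what a “standardized set of elements” could look like in practice, here is a small sketch of an incident record that serializes to JSON. Every field name is hypothetical. Neither Bowen’s letter nor the EAC specifies these fields, and a real format would be designed with election officials.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class IncidentReport:
    """Hypothetical incident record; all field names are illustrative only."""
    jurisdiction: str          # reporting county or municipality
    election_date: str         # ISO 8601 date, e.g. "2008-11-04"
    equipment_model: str       # make and model of the affected equipment
    description: str           # what the election official observed
    ballots_affected: Optional[int] = None  # left empty when unknown


def to_json(report):
    """Serialize a report with stable key order so records are easy to compare."""
    return json.dumps(asdict(report), sort_keys=True)


if __name__ == "__main__":
    example = IncidentReport(
        jurisdiction="Example County",
        election_date="2008-11-04",
        equipment_model="Example Scanner 1",
        description="Scanner rejected ballots after a paper jam.",
    )
    print(to_json(example))
```

The point of fixing the fields in advance is that reports from different jurisdictions become directly comparable, which free-form press accounts never are.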

Finally, many of us were disappointed to learn that the 2008 Election Day Survey would not include questions about voting system performance. Election Day is a unique and hard-to-replicate event, yet very little systematic data is collected about voting machine performance. The OurVoteLive and MyVote1 efforts go a long way towards providing actionable, qualitative data that can help to increase enfranchisement. However, self-reported data from the operators of the machinery of our democracy would be a gold mine in terms of identifying and examining trends in how this machinery performs, both good and bad.

I know a number of people, including Susannah Goodman at Common Cause as well as John Gideon and Ellen Theisen of VotersUnite!, who have been championing one or another of these proposals in their advocacy. The fact that Debra Bowen has penned this letter is a testament to the reason behind their efforts.

DRM In Retreat

Last week’s agreement between Apple and the major record companies to eliminate DRM (copy protection) in iTunes songs marks the effective end of DRM for recorded music. The major online music stores are now all DRM-free, and CDs still lack DRM, so consumers who acquire music will now expect it without DRM. That’s a sensible result, given the incompatibility and other problems caused by DRM, and it’s a good sign that the record companies are ready to retreat from DRM and get on with the job of reinventing themselves for the digital world.

In the movie world, DRM for stored content may also be in trouble. On DVDs, the CSS DRM scheme has long been a dead letter, technologically speaking. The Blu-ray scheme is better, but if Blu-ray doesn’t catch on, this doesn’t matter.

Interestingly, DRM is not retreating as quickly in systems that stream content on demand. This makes sense because the drawbacks of DRM are less salient in a streaming context: there is no need to maintain compatibility with old content; users can be assumed to be online so software can be updated whenever necessary; and users worry less about preserving access when they know they can stream the content again later. I’m not saying that DRM causes no problems with streaming, but I do think the problems are less serious than in a stored-content setting.

In some cases, streaming uses good old fashioned incompatibility in place of DRM. For example, a stream might use a proprietary format and the most convenient software for watching streams might lack a “save this video” button.

It remains to be seen how far DRM will retreat. Will it wither away entirely, or will it hang on in some applications?

Meanwhile, it’s interesting to see traditional DRM supporters back away from it. RIAA chief Mitch Bainwol now says that the RIAA is agnostic on DRM. And DRM cheerleader Bill Rosenblatt has relaunched his “DRM Watch” blog under the new title “Copyright and Technology”. The new blog’s first entry: iTunes going DRM-free.

Optical-scan voting extremely accurate in Minnesota

The recount of the 2008 Minnesota Senate race gives us an opportunity to evaluate the accuracy of precinct-count optical-scan voting. Though there have been contentious disputes over which absentee ballot envelopes to open, the core technology for scanning ballots has proved to be extremely accurate.

The votes were counted by machine (except for part of one county that counts votes by hand), then every single ballot was examined by hand in the recount.

The “net” accuracy of optical-scan voting was 99.99% (see below).
The “gross” accuracy was 99.91% (see below).
The rate of ambiguous ballots was low: 99.99% of ballots were unambiguous (see below).

My analysis is based on the official spreadsheet from the Minnesota Secretary of State. I commend the Secretary of State for his commitment to transparency in the form of making the data available in such an easy-to-analyze format. The vast majority of the counties use the ES&S M100 precinct-count optical scanners; a few use other in-precinct scanners.

I exclude from this analysis all disputes over which absentee ballots to open. Approximately 10% of the ballots included in this analysis are optically scanned absentee ballots that were not subject to dispute over eligibility.

There were 2,423,851 votes counted for Coleman and Franken. The “net” error rate is the net change in the vote margin from the machine-scan to the hand recount (not including change related to qualification of absentee ballot envelopes). This was 264 votes, for an accuracy of 99.99% (error, one part in ten thousand).

The “gross” error rate is the total number of individual ballots either added to one candidate, or subtracted from one candidate, by the recount. A ballot that was changed from one candidate to the other will count twice, but such ballots are rare. In the precinct-by-precinct data, the vast majority of precincts have no change; many precincts have exactly one vote added to one candidate; few precincts have votes subtracted, or more than one vote added, or both.

The recount added a total of 1,528 votes to the candidates, and subtracted a total of 642 votes, for a gross change of 2,170 (again, not including absentee ballot qualification). Thus, the “gross” error rate is about 1 in 1,000, or a gross accuracy of 99.91%.
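For readers who want to check the arithmetic, the net and gross figures can be reproduced in a few lines. This is only a sketch using the totals quoted in this post. It assumes that both rates use the 2,423,851 votes counted for the two candidates as the denominator.

```python
# Totals quoted in the post.
TOTAL_VOTES = 2_423_851      # votes counted for Coleman and Franken
NET_CHANGE = 264             # net change in the margin after the hand recount
VOTES_ADDED = 1_528          # votes the recount added to the candidates
VOTES_SUBTRACTED = 642       # votes the recount subtracted from the candidates


def accuracy_percent(errors, total):
    """Accuracy as a percentage: the share of votes that did not change."""
    return 100.0 * (1 - errors / total)


gross_change = VOTES_ADDED + VOTES_SUBTRACTED
net_accuracy = accuracy_percent(NET_CHANGE, TOTAL_VOTES)
gross_accuracy = accuracy_percent(gross_change, TOTAL_VOTES)

print(f"gross change:   {gross_change}")          # 2170
print(f"net accuracy:   {net_accuracy:.2f}%")     # 99.99%
print(f"gross accuracy: {gross_accuracy:.2f}%")   # 99.91%
```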

Ambiguous ballots: During the recount, the Coleman and Franken campaigns initially challenged a total of 6,655 ballot-interpretation decisions made by the human recounters. The State Canvassing Board asked the campaigns to voluntarily withdraw all but their most serious challenges, and in the end approximately 1,325 challenges remained. That is, approximately 5 ballots in 10,000 were ambiguous enough that one side or the other felt like arguing about it. The State Canvassing Board, in the end, classified all but 248 of these ballots as votes for one candidate or another. That is, approximately 1 ballot in 10,000 was ambiguous enough that the bipartisan recount board could not determine an intent to vote. (This analysis is based on the assumption that if the voter made an ambiguous mark, then this ballot was likely to be challenged either by one campaign or the other.)
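The “per 10,000” rates above follow from the same kind of calculation. Again this is a sketch under one assumption: the rates are taken relative to the 2,423,851 votes counted for the two candidates, which is the only total quoted in this post.

```python
# Figures quoted in the post.
TOTAL_VOTES = 2_423_851          # assumed denominator for the rates
REMAINING_CHALLENGES = 1_325     # challenges left after voluntary withdrawals
UNRESOLVED_BALLOTS = 248         # ballots not assigned to either candidate


def per_ten_thousand(count, total):
    """Express a count as a rate per 10,000 ballots."""
    return 10_000 * count / total


challenged_rate = per_ten_thousand(REMAINING_CHALLENGES, TOTAL_VOTES)
unresolved_rate = per_ten_thousand(UNRESOLVED_BALLOTS, TOTAL_VOTES)
unambiguous_percent = 100.0 * (1 - UNRESOLVED_BALLOTS / TOTAL_VOTES)

print(f"challenged:  {challenged_rate:.1f} per 10,000")   # 5.5 per 10,000
print(f"unresolved:  {unresolved_rate:.1f} per 10,000")   # 1.0 per 10,000
print(f"unambiguous: {unambiguous_percent:.2f}%")         # 99.99%
```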

Caveat: As with all voting systems, including optical-scan, DREs, and plain old paper ballots, there is also a source of error from voters incorrectly translating their intent into the marked ballot. Such error is likely to be greater than 0.1%, but the analysis I have done here does not measure this error.

Hand counting: Saint Louis County, which uses a mix of optical-scan and hand-counting, had a higher error rate: net accuracy 99.95%, gross accuracy 99.81%.

Tech Policy Challenges for the Obama Administration

[Princeton’s Woodrow Wilson School asked me to write a short essay on information technology challenges facing the Obama Administration, as part of the School’s Inaugural activities. Here is my essay.]

Digital technologies can make government more effective, open and transparent, and can make the economy as a whole more flexible and efficient. They can also endanger privacy, disrupt markets, and open the door to cyberterrorism and cyberespionage. In this crowded field of risks and opportunities, it makes sense for the Obama administration to focus on four main challenges.

The first challenge is cybersecurity. Government must safeguard its own mission-critical systems, and it must protect privately owned critical infrastructures such as the power grid and communications network. But it won’t be enough to focus only on a few high-priority, centralized systems. Much of digital technology’s value, and today many of the threats, comes from ordinary home and office systems. Government can use its purchasing power to nudge the private sector toward products that are more secure and reliable; it can convene standards discussions; and it can educate the public about basic cybersecurity practices.

The second challenge is transparency. We can harness the potential of digital technology to make government more open, leading toward a better informed and more participatory civic life. Some parts of government are already making exciting progress, and need high-level support; others need to be pushed in the right direction. One key is to ensure that data is published in ways that foster reuse, to support an active marketplace of ideas in which companies, nonprofits, and individuals can find the best ways to analyze, visualize, and “mash up” government information.

The third challenge is to maintain and increase America’s global lead in information technology, which is vital to our prosperity and our role in the world. While recommitting to our traditional strengths, we must work to broaden the reach of technology. We must bring broadband Internet connections to more Americans, by encouraging private-sector investment in high-speed network infrastructure. We must provide better education in information technology, no less than in science or math, to all students. Government cannot solve these problems alone, but can be a catalyst for progress.

The final challenge is to close the culture gap between politicians and technology leaders. The time for humorous anecdotes about politicians who “don’t get” technology, or engineers who are blind to the subtleties of Washington, is over. Working together, we can translate technological progress into smarter government and a more vibrant, dynamic private sector.