August 23, 2019

Archives for August 2009

Introducing RECAP: Turning PACER Around

With today’s technologies, government transparency means much more than the chance to read one document at a time. Citizens today expect to be able to download comprehensive government datasets that are machine-processable, open and free. Unfortunately, government is much slower than industry when it comes to adopting new technologies. In recent years, private efforts have helped push government, the legislative and executive branches in particular, toward greater transparency. Thus far, the judiciary has seen relatively little action.

Today, we are excited to announce the public beta release of RECAP, a tool that will help bring an unprecedented level of transparency to the U.S. federal court system. RECAP is a plug-in for the Firefox web browser that makes it easier for users to share documents they have purchased from PACER, the court’s pay-to-play access system. With the plug-in installed, users still have to pay each time they use PACER, but whenever they do retrieve a PACER document, RECAP automatically and effortlessly donates a copy of that document to a public repository hosted at the Internet Archive. The documents in this repository are, in turn, shared with other RECAP users, who will be notified whenever documents they are looking for can be downloaded from the free public repository. RECAP helps users exercise their rights under copyright law, which expressly places government works in the public domain. It also helps users advance the public good by contributing to an extensive and freely available archive of public court documents.

The project’s website,, has all of the details– how to install RECAP, a screencast of the plug-in in action, more discussion of why this issue matters, and a host of other goodies.

The repository already has over one million documents available for free download. Together, with the help of RECAP users, we can recapture truly public access to the court proceedings that give our laws their practical meaning.

Anonymization FAIL! Privacy Law FAIL!

I have uploaded my latest draft article entitled, Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization to SSRN (look carefully for the download button, just above the title; it’s a little buried). According to my abstract:

Computer scientists have recently undermined our faith in the privacy-protecting power of anonymization, the name for techniques for protecting the privacy of individuals in large databases by deleting information like names and social security numbers. These scientists have demonstrated they can often “reidentify” or “deanonymize” individuals hidden in anonymized data with astonishing ease. By understanding this research, we will realize we have made a mistake, labored beneath a fundamental misunderstanding, which has assured us much less privacy than we have assumed. This mistake pervades nearly every information privacy law, regulation, and debate, yet regulators and legal scholars have paid it scant attention. We must respond to the surprising failure of anonymization, and this Article provides the tools to do so.

I have labored over this article for a long time, and I am very happy to finally share it publicly. Over the next week, or so, I will write a few blog posts here, summarizing the article’s high points and perhaps expanding on what I couldn’t get to in a mere 28,000 words.

Thanks to Ed, David, and everybody else at Princeton’s CITP for helping me develop this article during my visit earlier this year.

Please let me know what you think, either in these comments or by .

Open Government Data: Starting to Judge the Results

Like many others who read this blog, I’ve spent some time over the last year trying to get more civic data online. I’ve argued that government’s failure to put machine-readable data online is the key roadblock that separates us from a world in which exciting, Web 2.0 style technologies enrich nearly every aspect of civic life. This is an empirical claim, and as more government data comes online, it is being tested.

Jay Nath is the “manager of innovation” for the City and County of San Francisco, working to put municipal data online and build a community of developers who can make the most of it. In a couple of recent blog posts, he has considered the empirical state of government data publishing efforts. Drawing on data from Washington DC, where officials led by then-city CTO Vivek Kundra have put a huge catalog of government data online, he analyzed usage statistics and found an 80/20 pattern of public use of online government data — enormous interest in crime statistics and 311-style service requests, but relatively little about housing code enforcement and almost none about city workers’ use of purchasing credit cards. Here’s the chart: he made (larger version)

Note that this chart measures downloads, not traffic to downstream sites that may be reusing the data.

This analysis was part of a broader effort in San Francisco to begin measuring the return on investments in open government data. One simple measure, as many have remarked before, is foregone IT expenditures that are avoided when third party innovators make it unnecessary for government to provide certain services or make certain investments. But this misses what seems, intuitively, to be the lion’s share of the benefit: New value that didn’t exist before and is created by the extra functionality that third party innovators deliver, but government would not. Another approach is to measure government responsiveness before and after effectiveness data begin to be published. Unfortunately, such measures are unlikely to be controlled — if services get worse, for example, it may have more to do with budget cuts than with any victory, or failure, of citizen monitoring.

Open government data advocates and activists have allies on the inside in a growing number of governmental contexts, from city hall to the White House. But for these allies to be successful, they will need to be able to point to concrete results — sooner and more urgently in the current economic climate than they might have had to do otherwise. This holds a clear lesson for the activists: Small, tangible, steps that turn published government data into cost savings, measurable service improvements, or other concrete goods will “punch above their weight” : not only are they valuable in their own right, but they help favorably disposed civic servants make the case internally for more transparency and disclosure. Beyond aiming for perfection and thinking about the long run, the volunteer community would benefit from seeking low hanging fruit that will prove the concept of open government data and justify further investment.