November 23, 2024

Open Government Data: Starting to Judge the Results

Like many others who read this blog, I’ve spent some time over the last year trying to get more civic data online. I’ve argued that government’s failure to put machine-readable data online is the key roadblock that separates us from a world in which exciting, Web 2.0 style technologies enrich nearly every aspect of civic life. This is an empirical claim, and as more government data comes online, it is being tested.

Jay Nath is the “manager of innovation” for the City and County of San Francisco, working to put municipal data online and build a community of developers who can make the most of it. In a couple of recent blog posts, he has considered the empirical state of government data publishing efforts. Drawing on data from Washington DC, where officials led by then-city CTO Vivek Kundra have put a huge catalog of government data online, he analyzed usage statistics and found an 80/20 pattern of public use of online government data: enormous interest in crime statistics and 311-style service requests, but relatively little in housing code enforcement and almost none in city workers’ use of purchasing credit cards. He presented the results in a chart.

Note that this chart measures downloads, not traffic to downstream sites that may be reusing the data.
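For readers who want to run the same kind of check on their own catalog, here is a minimal sketch of an 80/20 analysis of download counts. The dataset names and numbers are hypothetical placeholders, not Nath's figures, and the function names are my own.

```python
from itertools import accumulate

# Hypothetical download counts per dataset (illustrative only, not real figures).
downloads = {
    "crime_incidents": 5200,
    "service_requests_311": 3100,
    "building_permits": 700,
    "housing_code_enforcement": 400,
    "purchase_card_transactions": 100,
}


def cumulative_share(counts):
    """Return (dataset, cumulative share of all downloads), most-downloaded first."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(counts.values())
    running = accumulate(count for _, count in ranked)
    return [(name, subtotal / total) for (name, _), subtotal in zip(ranked, running)]


def smallest_head(counts, threshold=0.8):
    """Return the fewest top datasets that together reach `threshold` of downloads."""
    head = []
    for name, share in cumulative_share(counts):
        head.append(name)
        if share >= threshold:
            break
    return head


if __name__ == "__main__":
    for name, share in cumulative_share(downloads):
        print(f"{name:30s} {share:6.1%}")
    print("Datasets covering 80% of downloads:", smallest_head(downloads))
```

With these made-up numbers, two of the five datasets account for more than 80 percent of downloads. As with the chart, a count like this says nothing about traffic to downstream sites that reuse the data.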

This analysis was part of a broader effort in San Francisco to begin measuring the return on investments in open government data. One simple measure, as many have remarked before, is the IT expenditure that government avoids when third-party innovators make it unnecessary to provide certain services or make certain investments. But this misses what seems, intuitively, to be the lion’s share of the benefit: new value that didn’t exist before, created by extra functionality that third-party innovators deliver but government would not. Another approach is to measure government responsiveness before and after effectiveness data begin to be published. Unfortunately, such measures are unlikely to be controlled: if services get worse, for example, it may have more to do with budget cuts than with any victory, or failure, of citizen monitoring.

Open government data advocates and activists have allies on the inside in a growing number of governmental contexts, from city hall to the White House. But for these allies to be successful, they will need to be able to point to concrete results, and in the current economic climate they will need to do so sooner and more urgently than they might otherwise have had to. This holds a clear lesson for the activists: small, tangible steps that turn published government data into cost savings, measurable service improvements, or other concrete goods will “punch above their weight.” Not only are they valuable in their own right, but they also help favorably disposed civil servants make the case internally for more transparency and disclosure. Beyond aiming for perfection and thinking about the long run, the volunteer community would benefit from seeking low-hanging fruit that will prove the concept of open government data and justify further investment.

Recovery Act Spending: Getting to the Bottom Line

Under most circumstances, government spending is slow and deliberate—a key fact that helps reduce the chances of waste and fraud. But the recently passed Recovery Act is a special case: spending the money quickly is understood to be essential to the success of the Act. We all know that shoppers in a hurry tend to get less value for their money. But, ironically, the overall macroeconomic impact of the stimulus (and hence the average stimulative effect per dollar spent) may be maximized by quick spending, even if the speed premium does increase the total amount of waste and abuse.

This situation creates a paradox for transparency and oversight efforts. On the one hand, the quicker pace of spending makes it all the more important to provide for public scrutiny, and to provide information in ways that will rapidly enable as many people as possible to take advantage of the stimulus opportunities available to them. On the other, the same rush that makes transparency important also reduces the time available for those within government to design and build an infrastructure for stimulus transparency.

One of the troubling tradeoffs that has been made thus far involves information about stimulus funds that flow from the federal government to states and then from states to localities. This pattern is rarer than you might think, since much of the Recovery Act spending flows more directly from federal agencies to its end recipients. But for funds that do follow a path from federal to state to local officials, recent guidance issued April 3 by the Office of Management and Budget (OMB) makes clear that the federal reporting infrastructure being created for Recovery.gov will not collect information about what the localities ultimately do with the funds.

OMB says that it does have the legal authority to require detailed reporting on “all levels of subawards,” reaching end recipients (Acme Concrete or whoever gets a contract or grant from the municipality at the end of the governmental chain). But in the context of its sprint to get at least some system into place as soon as possible (with the debut date for the Recovery.gov system already pushed back to October), OMB has left this deep-level reporting out of its immediate plans. The office says that it “plans to expand the reporting model in the future to also obtain this information, once the system capabilities and processes have been established.”

On Monday, ten congressmen sent a letter to OMB urging it to collect this detailed information “as early as possible.” One reason for OMB to formulate detailed operational plans in this area, as I argued in recent testimony before the House Committee on Oversight and Government Reform, is that clarity from the top will help states make competent choices about what if anything they should do to support or supplement the federal reporting. As the members of Congress write:

While it is positive that OMB goes on to reserve the right in the guidance to expand this reporting model in the future, it would seem exercising this right and requiring this level of reporting as early as possible would help entities prepare for the disclosures before projects begin and provide clarification for states as they begin investing in new infrastructure to track ARRA funds.

In the end, everyone agrees that this detailed information about subawards is important to have: OMB “plans to collect” it and the signatories to yesterday’s letter want collection to start “as early as possible.” But how soon is that? We don’t really know. The details of hard choices facing OMB as it races to implement the Recovery.gov reporting system are themselves not public, and making them public might (or might not) itself slow down the development of the site. If no system were permitted to launch without fully detailed reporting of subawards, we might wait longer for the web site’s launch. How much longer? OMB might not itself be sure, since software development times are notoriously difficult to forecast, and OMB has never before been asked to build a system of this kind. OMB asserts that it’s moving as fast as it can to collect as much information as possible, and without slowing it down to ask for explanations, we can’t really check that assertion.

Transparency often reduces the degree to which citizens must trust public officials. But in this case, ironically, it seems most reasonable to operate on the optimistic but realistic assumption that the people working on Recovery Act transparency are doing their jobs well, and to hope for good results.

Government Online: Outreach vs. Transparency

These days everybody in Washington seems to be jumping on the Twitter bandwagon. The latest jumpers are four House committees, according to Tech Daily Dose.

The committees, like a growing number of individual members’ offices, plan to use Twitter as a new tool to reach their audience and ensure transparency between the government and the public.

“I believe government works best when it is transparent and information is accessible to all….” [said a committee chair].

I’m all in favor of public officials using technology to communicate with us. But Twitter is a tool for outreach, not transparency.

Here’s the difference: outreach means government telling us what it wants us to hear; transparency means giving us the information that we, the citizens, want to get. An ideal government provides both outreach and transparency. Outreach lets officials share their knowledge about what is happening, and it lets them argue for particular policy choices — both of which are good. Transparency keeps government honest and responsive by helping us know what government is doing.

Twitter, with its one-way transmission of 140-character messages, may be useful for outreach, but it won’t give us transparency. So, Congressmembers: Thanks for Twittering, but please don’t forget about transparency.

(Interestingly, the students in my tech policy class were surprised to hear that any of the digerati had ever Twittered. The students think of Twitter as a tool for aging hepcat techno-poseurs. [Insert your own joke here.])

Meanwhile, the Obama team is having trouble transitioning its famous online outreach machinery into government, according to Jose Antonio Vargas’s story in the Washington Post:

WhiteHouse.gov, envisioned as the primary vehicle for President Obama to communicate with the online masses, has been overwhelmed by challenges that staffers did not foresee and technological problems they have yet to solve.

Obama, for example, would like to send out mass e-mail updates on presidential initiatives, but the White House does not have the technology in place to do so. The same goes for text messaging, another campaign staple.

Beyond the technological upgrades needed to enable text broadcasts, there are security and privacy rules to sort out involving the collection of cellphone numbers, according to Obama aides, who acknowledge being caught off guard by the strictures of government bureaucracy.

Here again we see a difference between outreach and transparency. Outreach, by its nature, must be directed by government. But transparency, which aims to offer citizens the information they want, is best embodied by vigorous activity outside of government, enabled by government providing free and open access to data. As we argued in our Invisible Hand paper, many things are inherently more difficult to do inside of government, so the key role of government is to enable a marketplace of ideas in the private sector, rather than doing the whole job.

Kundra Named As Federal CIO

Today, the Obama administration named Vivek Kundra as the Chief Information Officer of the U.S. government, a newly created position.

This is great news. Kundra, in his previous role as CTO of the District of Columbia, made great strides in opening the DC government by publishing government data. When he spoke at our Thursday Forum last fall, everyone was impressed by how quickly and effectively he had transformed the DC government’s approach to technology.

First, he set up an open Data Catalog, where lots of data collected by the DC government is freely available in standard formats. Second, he ran the Apps for Democracy contest, in which he challenged citizens to develop applications to take advantage of all the data that the DC government is publishing. The results were impressive—with 47 different apps submitted by citizens—and also inexpensive.

Most impressively, in doing this he overcame the natural inertia of big city government. The Federal government will be even harder to budge, but with the right support from the top, Kundra could bring a new level of openness and tech-friendliness to the government.

Federal Health IT Effort Is Making Progress, Could Benefit from More Transparency

President Obama has indicated that health information technology (HIT) is an important component of his administration’s health care goals. Politicians on both sides of the aisle have lauded the potential for HIT to reduce costs and improve care. In this post, I’ll give some basics about what HIT is, what work is underway, and how the government can get more security experts involved.

We can coarsely break HIT into three technical areas. The first area is the transition from paper to electronic records, which involves surprisingly many subtle technical issues like interoperability. Second, development of health information networks will allow sharing of patient data between medical facilities and with other appropriate parties. Third, as a recent National Research Council report discusses, digital records can enable research in new areas, such as cognitive support for physicians.

HIT was not created on the 2008 campaign trail. The Department of Veterans Affairs (VA) has done work in this area for decades, including its widely praised VistA system, which provides electronic patient records and more. Notably, VistA source code and documentation can be freely downloaded. Many other large medical centers also already use electronic patient records.

In 2004, then-President Bush pushed for deployment of a Nationwide Health Information Network (NHIN) and universal adoption of electronic patient records by 2014. The NHIN is essentially a nationwide network for sharing relevant patient data (e.g., if you arrive at an emergency room in Oregon, the doctor can obtain needed records from your regular doctor in Kansas). The Department of Health and Human Services (HHS) funded four consortia to develop smaller, localized networks, partially as a learning exercise to prepare for the NHIN. HHS has held a number of forums where members of these consortia, the government, and the public can meet and discuss timely issues.

The agendas for these forums show some positive signs. Sessions cover a number of tricky issues. For example, participants in one session considered the risk that searches for a patient’s records in the NHIN could yield records for patients with similar attributes, posing privacy concerns. Provided that meaningful conversations occurred, HHS appears to be making a concerted effort to ensure that issues are identified and discussed before settling on solutions.
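To make the record-search risk concrete, here is a toy sketch of how a lookup on a few demographic attributes can return more than one patient. It assumes a simple in-memory list and a match on last name and birth date only; the records, field names, and matching rule are all hypothetical and are not a description of how the NHIN or any consortium system actually works.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PatientRecord:
    record_id: str
    last_name: str
    first_name: str
    birth_date: date
    state: str


# Hypothetical records, invented purely for illustration.
RECORDS = [
    PatientRecord("KS-001", "Smith", "John", date(1970, 3, 14), "KS"),
    PatientRecord("OR-417", "Smith", "Jon", date(1970, 3, 14), "OR"),
    PatientRecord("KS-002", "Smyth", "Joan", date(1982, 7, 1), "KS"),
]


def search(records, last_name, birth_date):
    """Match on last name (case-insensitive) and birth date only."""
    wanted = last_name.casefold()
    return [
        record
        for record in records
        if record.last_name.casefold() == wanted and record.birth_date == birth_date
    ]


def is_ambiguous(matches):
    """More than one hit means the query alone cannot pick out a single patient."""
    return len(matches) > 1


if __name__ == "__main__":
    matches = search(RECORDS, "smith", date(1970, 3, 14))
    print([record.record_id for record in matches], "ambiguous:", is_ambiguous(matches))
```

In this toy data the query matches two different people, so a system that returned both records to the requester would expose one patient's information to someone looking for another. How a real network should handle such ambiguity is exactly the sort of question these sessions need to settle.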

Unfortunately, the academic information security community seems divorced from these discussions. Members of the community are likely to analyze the various proposed systems eventually, whether before or after they are widely deployed, and earlier analysis would be preferable. In spite of the positive signs mentioned, past experience shows that even skilled developers can produce insecure systems. Any major flaws uncovered may be embarrassing, but weaknesses found now would be cheaper and easier to fix than ones found in 2014.

A great way to draw constructive scrutiny is to ensure transparency in federally funded HIT work. Limited project details are often available online, but both high- and low-level details can be hard to find. Presumably, members of the NHIN consortia (for example) developed detailed internal documents containing use cases, perceived risks/threats, specifications, and architectural illustrations.

To the extent legally feasible, the government should make documents like these available online. Access to them would make the projects easier to analyze, particularly for those of us less familiar with HIT. In addition, a typical vendor response to reported vulnerabilities is that the attack scenario is unrealistic (this is a standard response of e-voting vendors). Researchers can use these documents to ensure that they consider only realistic attacks.

The federal agenda for HIT is ambitious and will likely prove challenging and expensive. To avoid massive, costly mistakes, the government should seek to get as many eyes as possible on the work that it funds.