November 21, 2024

New USACM Policy Recommendations on Open Government

USACM is the Washington policy committee of the Association for Computing Machinery, the professional association that represents computer scientists and computing practitioners. Today, USACM released Policy Recommendations on Open Government. The recommendations offer simple, clear advice to help Congress and the new administration make government initiatives—like the pending recovery bill—transparent to citizens.

The leading recommendation is that data be published in formats that “promote analysis and reuse of the data”—in other words, machine-readable formats that give citizens, rather than only government, the chance to decide how the data will be analyzed and presented. Regular Freedom to Tinker readers may recall that we have made this argument here before: The proposed Recovery.gov should offer machine-readable data, rather than only government-issue “presentations” of it. Ed and I both took part in the working group that drafted these new recommendations, and we’re pleased to be able to share them with you now, while the issue is in the spotlight.

Today’s statement puts the weight of America’s computing professionals behind the push for machine-readable government data. It also sends a clear signal to the Executive Branch, and to Congress, that America’s computing professionals stand ready to help realize the full potential of new information technologies in government.

Here are the recommendations in full:

  • Data published by the government should be in formats and approaches that promote analysis and reuse of that data.
  • Data republished by the government that has been received or stored in a machine-readable format (such as online regulatory filings) should preserve the machine-readability of that data.
  • Information should be posted so as to also be accessible to citizens with limitations and disabilities.
  • Citizens should be able to download complete datasets of regulatory, legislative or other information, or appropriately chosen subsets of that information, when it is published by government.
  • Citizens should be able to directly access government-published datasets using standard methods such as queries via an API (Application Programming Interface).
  • Government bodies publishing data online should always seek to publish using data formats that do not include executable content.
  • Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.
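As a rough illustration of the last two recommendations, the sketch below publishes a tiny dataset as plain JSON (a format with no executable content) together with a SHA-256 digest and a publication date. The record fields, the function names, and the manifest layout are all hypothetical, not anything USACM specifies. A bare digest is also a much weaker stand-in for a true digital signature: it only helps if the manifest itself comes from a trusted channel.

```python
import hashlib
import json


def publish(records, published_on):
    """Serialize records deterministically and build a small manifest."""
    # sort_keys and fixed separators yield identical bytes for identical data
    payload = json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")
    manifest = {
        "published_on": published_on,  # hypothetical attestation field
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    return payload, manifest


def verify(payload, manifest):
    """Return True if the payload still matches the published digest."""
    return hashlib.sha256(payload).hexdigest() == manifest["sha256"]


# Hypothetical example record, for illustration only
records = [{"agency": "Example Agency", "amount_usd": 1000}]
payload, manifest = publish(records, "2009-02-01")
```

Anyone who downloads the payload can rerun `verify` against the manifest to check that the bytes have not changed since publication.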

The (Ironic) Best Way to Make the Bailout Transparent

The next piece of proposed bailout legislation is called the American Recovery and Reinvestment Act of 2009. Chris Soghoian, who is covering the issue on his Surveillance State blog at CNET, brought the bill to my attention, particularly a provision requiring that a new web site called Recovery.gov “provide data on relevant economic, financial, grant, and contract information in user-friendly visual presentations to enhance public awareness of the use funds made available in this Act.” As a group of colleagues and I suggested last year in Government Data and the Invisible Hand, there’s an easy way to make rules like this one a great deal more effective.

Ultimately, we all want information about bailout spending to be available in the most user-friendly way to the broadest range of citizens. But is a government monopoly on “presentations” of the data the best way to achieve that goal? Probably not. If Congress orders the federal bureaucracy to provide a web site for end users, then we will all have to live with the one web site they cook up. Regular citizens would have more and better options for learning about the bailout if Congress told the executive branch to provide the relevant data in a structured machine-readable format such as XML, so many sites can be made to analyze the data. (A government site aimed at end users would also be fine. But we’re only apt to get machine-readable data if Congress makes it a requirement.)

Why does this matter? Because without the underlying data, anyone who wants to provide a useful new tool for analysis must first try to reconstruct the underlying numbers from the “user-friendly visual presentations” or “printable reports” that the government publishes. Imagine trying to convert a nice-looking graph back into a list of figures, or trying to turn a printed transcript of a congressional debate into a searchable database of who said what and when. It’s not easy.
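To make the contrast concrete, here is a minimal sketch of what a third party could do in a few lines once structured data exists. The XML schema, element names, and figures below are invented for illustration; nothing here reflects an actual Recovery.gov format.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data in a made-up schema
SAMPLE_XML = """
<awards>
  <award state="NJ" amount="1500000" recipient="Example Transit Authority"/>
  <award state="NJ" amount="250000" recipient="Example School District"/>
  <award state="CA" amount="4000000" recipient="Example Water Agency"/>
</awards>
"""


def totals_by_state(xml_text):
    """Sum award amounts per state from the hypothetical <awards> document."""
    root = ET.fromstring(xml_text.strip())
    totals = defaultdict(int)
    for award in root.iter("award"):
        totals[award.get("state")] += int(award.get("amount"))
    return dict(totals)


totals = totals_by_state(SAMPLE_XML)
```

The same totals would take hours of error-prone transcription to recover from a chart or a printable report, which is the point of asking for the underlying data.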

Once the computer-readable data is out there—whether straightforwardly published by the government officials who have it in the first place, or painstakingly recreated by volunteers who don’t—we know that a small army of volunteers and nonprofits stands ready to create tools that regular citizens, even those with no technical background at all, will find useful. This group of volunteers is itself a small constituency, but the things they make, like Govtrack, Open Congress, and Washington Watch, are used by a much broader population of interested citizens. The federal government might decide to put together a system for making maps or graphs. But what about an interactive one like this? What about three-dimensional animated visualizations over time? What about an interface that’s specially designed for blind users, who still want to organize and analyze the data but may be unable to benefit as most of us can from visualizations? There might be an interface in Spanish, the second most common American language, but what about one in Tagalog, the sixth most common?

There’s a deep and important irony here: The best way for government data to reach the broadest possible population is probably to release it in a form that nobody wants to read. XML files are called “machine-readable” because they make sense to a computer, rather than to human eyes. Releasing the data that way—so a variety of “user-friendly presentations,” to match the variety of possible users, can emerge—is what will give regular citizens the greatest power to understand and react to the bailout. It would be a travesty to make government the only source for interaction with bailout data—the transparency equivalent of central planning. It would be better for everyone, and easier, to let a thousand mashups bloom.

Satyam and the Inadvertent Web

Satyam is one of the handful of large companies that dominate the IT outsourcing market in India. A week ago today, B. Ramalinga Raju, the company chairman, confessed to a years-long accounting fraud. More than a billion dollars of cash the company claimed to have on hand, and the business success that putatively generated those dollars, now appear to have been fictitious.

There are many tech policy issues here. For one, frauds this massive in high tech environments are a challenge and opportunity for computer forensics. For another, though we can hope this situation is unique, it may turn out to be the tip of an iceberg. If Satyam turns out to be part of a pattern of lax oversight and exaggerated profits across India’s high tech sector, it might alter the way we look at high tech globalization, forcing us to revise downward our estimates of high tech’s benefits in India. (I suppose it could be construed as a silver lining that such news might also reveal America, and other western nations, to be more globally competitive in this arena than we had believed them to be.)

But my interest in the story is more personal. I met Mr. Raju in early 2007, when Satyam helped organize and sponsor a delegation of American journalists to India. (I served as Managing Editor of The American at the time.) India’s tech sector wanted good press in America, a desire perhaps increased by the fact that Democrats who were sometimes skeptical of free trade had just assumed control of the House. It was a wonderful trip—we were treated well at others’ expense and got to see, and learn about, the Indian tech sector and the breathtaking city of Hyderabad. I posted pictures of the trip on Flickr, mentioning “Satyam” in the description, showed the pics to a few friends, and moved on with life.

Then came last week’s news. Here’s the graph of traffic to my Flickr account: That spike represents several thousand people suddenly viewing my pictures of Satyam’s pristine campus.

When I think about the digital “trails” I leave behind (the Flickr, Facebook and Twitter ephemera that define me by implication), there are some easy presumptions about what the future will hold. Evidence of raw emotions, the unmediated anger, romantic infatuation, depression or exhilaration that life sometimes holds, should generally be kept out of the record, since the social norms that govern public display of such phenomena are still evolving. While others in their twenties may consider such material normal, it reflects a life-in-the-fishbowl style of conduct that older people can find untoward, a style that would years ago have counted as exhibitionistic or otherwise misguided.

I would never, however, have guessed that a business trip to a corporate office park might one day be a prominent part of my online persona. In this case, I happen to be perfectly comfortable with the result—but that feels like luck. A seemingly innocuous trace I leave online, that later becomes salient, might just as easily prove problematic for me, or for someone else. There seems to be a larger lesson here: That anything we leave online could, for reasons we can’t guess at today, turn out to be important later. The inadvertent web—the set of seemingly trivial web content that exists today and will turn out to be important—may turn out to be a powerful force in favor of limiting what we put online.

Taking Advantage of Citizen Contrarians

In my last post, I argued that sifting through citizens’ questions for the President is a job best done outside of government. More broadly, there’s a class of input that is good for government to receive, but that probably won’t be welcome at the staff level, where moment-to-moment success is more of a concern than long-term institutional thriving. Tough questions from citizens are in this category. So is unexpected, challenging or contrarian citizen advice or policy input. A flood of messages that tell the President “I’m in favor of what you already plan to do,” perhaps leavened with a sprinkling of “I respectfully disagree, but still like you anyway,” would make for great PR. Better yet, from the staff’s point of view, such messages offer no action-guiding advice, so they drive no change whatsoever in what anyone in government does, from the West Wing to the furthest corners of the executive branch.

Will the new administration set things up to run this way? I don’t know. Certainly, the cookie-cutter blandness of their responses to the first round of online citizen questions is not a promising sign. There’s no question that Obama himself sees some value in real, tough questions that come from the masses. But the immediate practical advantages of a choir that echoes the preacher may be a much more attractive prospect for his staff than the scrambling, search, and actual policy rethinking that might have to follow tough questions or unexpected advice.

This outcome would be a lost opportunity precisely because there are pockets of untapped expertise, uncommon wisdom, and bright ideas out there. Surfacing these insights—the inputs that weren’t already going to be incorporated into the policy process, the thoughts that weren’t talking points during the campaign, the things we didn’t already know—is precisely what the new collaborative technologies have made possible.

On the other hand, in order for this to work, we need to be able to regard (at least some of) the surprising, unexpected or quirky citizen inputs as successes for the system that attracted them, rather than failures. We can already find out what the median voter thinks, without all these fancy new systems, and in any case, his or her opinion is unlikely to add new or unexpected value to the policy process.

Obamacto.org, a potential model for external sites that gather citizen input for government, has a leaderboard of suggested priorities for the new CTO, voted up by visitors to the site. The first three suggestions are net neutrality regulation, Patriot Act repeal and DMCA repeal—unsurprising major issues. Arguably, if enough people took part in the online voting, there would be some value in knowing how the online group had prioritized these familiar requests. But with the fourth item, things get interesting: it reads “complete the job on metrication that Ronald Reagan defunded.”

On the one hand, my first reaction to this is to laugh: Regardless of whether or not moving to the metric system would be a good idea, it’s something that doesn’t have nearly the political support today that would be needed in order for it to be a plausible priority for Obama’s CTO. Put another way, there’s no chance that persuading America to do this is the best use of the new administration’s political capital.

On the other hand, maybe that’s what these sorts of online fora are for: Changing which issues are on the table, and how we think about them. The netroots turned net neutrality into a mainstream political issue, and for all I know they (or some other constellation of political forces) could one day do the same for the drive to go metric.

Readers, commenters: What do you think? Are quirky inputs like the suggestion that Obama’s CTO focus on metrication a hopeful sign for the value new deliberative technologies can add in the political process? Or, are they a sign that we haven’t figured out how these technologies should work or how to use them?

Government Shouldn't "Help" Citizens Pick Tough Questions for Obama

A couple of weeks ago, Julian Sanchez at Ars Technica, Ben Smith at Politico and others noted a disturbing pattern on the incoming Obama administration’s Change.gov website: polite but pointed user-submitted questions about the Blagojevich scandal and other potentially uncomfortable topics were being flagged as “inappropriate” by other visitors to the site.

In less than a week, more than a million votes-for-particular-questions were cast. The transition team closed submissions and posted answers to the five most popular questions. The usefulness and interest of these answers was sharply limited: They reiterated some of the key talking points and platform language of Obama’s campaign without providing any new information. The transition site is now hosting a second round of this process.

It shouldn’t surprise us that there are, among the President-elect’s many supporters, some who would rather protect their man from inconvenient questions. And for all the enthusiastic talk about wide-open debate, a crowdsourced system that lets anyone flag an item as inappropriate can give these few a perverse kind of veto over the discussion.

If the site’s operators recognize this kind of deliberative narrowing as a problem, there are ways to deal with it. One could require a consensus judgement of “inappropriateness” by a cross-section of Change.gov users that is large enough, or is diverse with respect to geography, time of visit, amount of past involvement in the site, or any number of other criteria before taking a question out of circulation. Questions that have been preliminarily flagged as inappropriate could enter a secondary moderation queue where their appropriateness can be debated, leading to a considered “up or down” vote on whether a given question belongs in the mix. The Obama transition team could even crowdsource this problem itself, looking for lay input (or input from experts at places like Digg) about how to make sure that reasonable-but-pointed questions stay in, while off topic, off color, or otherwise unacceptable ones remain out.
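The first of those ideas can be sketched in code. The version below sends a question to the secondary moderation queue only when its flaggers are both numerous and varied. This is an illustrative guess at one possible rule, not a description of how Change.gov works: the `Flag` fields, the thresholds, and the choice to require size and diversity together are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    """One 'inappropriate' flag; all fields are hypothetical."""
    user_id: str
    region: str
    prior_visits: int


def needs_review(flags, min_flaggers=50, min_regions=5, min_veterans=10):
    """Return True if a question should enter the secondary moderation queue.

    Assumed rule: enough distinct users flagged it, they come from enough
    distinct regions, and enough of them have a history on the site.
    """
    # Count each user once, however many times they clicked "flag"
    flaggers = {f.user_id: f for f in flags}
    if len(flaggers) < min_flaggers:
        return False
    regions = {f.region for f in flaggers.values()}
    veterans = sum(1 for f in flaggers.values() if f.prior_visits >= 3)
    return len(regions) >= min_regions and veterans >= min_veterans
```

Under a rule like this, a handful of determined supporters flagging the same question repeatedly would not be enough to pull it from circulation.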

But what are the incentives of the new administration’s online team? They might well find it convenient, as Julian writes, to “crowdsource a dodge” to inconvenient questions. If the users of Change.gov adopt an expansive view of “inappropriateness,” the Obama team will likely benefit slightly from soft, supportive questions in the near term, though it will run the risk of allowing substantive problems, or citizen concerns, to fester over the longer term. And that tradeoff could hold much more appeal for the median administration staffer than it does for the median American.

In other words, having the administration’s own tech people manage the moderation of questions directed at the President may be like having the fox guard the henhouse. I agree that even this is much more open than recent past administrations, but I think the more interesting question here is what would be ideal.

I suspect this key plank of the new administration’s plans can never be fully realized within government. The President needs to answer questions that a nonzero number of his most enthusiastic supporters are willing to characterize as “inappropriate.” And for that to happen, the online moderation needs to take place outside of .gov. A collective move toward one of the .org alternatives, for this key activity of sifting questions, would be a great first step. That way, the goal of finding tough but honest questions can plausibly sit paramount.