April 17, 2014

New USACM Policy Recommendations on Open Government

USACM is the Washington policy committee of the Association for Computing Machinery, the professional association that represents computer scientists and computing practitioners. Today, USACM released Policy Recommendations on Open Government. The recommendations offer simple, clear advice to help Congress and the new administration make government initiatives—like the pending recovery bill—transparent to citizens.

The leading recommendation is that data be published in formats that “promote analysis and reuse of the data”—in other words, machine-readable formats that give citizens, rather than only government, the chance to decide how the data will be analyzed and presented. Regular Freedom to Tinker readers may recall that we have made this argument here before: The proposed Recovery.gov should offer machine-readable data, rather than only government-issue “presentations” of it. Ed and I both took part in the working group that drafted these new recommendations, and we’re pleased to be able to share them with you now, while the issue is in the spotlight.

Today’s statement puts the weight of America’s computing professionals behind the push for machine-readable government data. It also sends a clear signal to the Executive Branch, and to Congress, that America’s computing professionals stand ready to help realize the full potential of new information technologies in government.

Here are the recommendations in full:

  • Data published by the government should be in formats and approaches that promote analysis and reuse of that data.
  • Data republished by the government that has been received or stored in a machine-readable format (such as online regulatory filings) should preserve the machine-readability of that data.
  • Information should be posted so as to also be accessible to citizens with limitations and disabilities.
  • Citizens should be able to download complete datasets of regulatory, legislative or other information, or appropriately chosen subsets of that information, when it is published by government.
  • Citizens should be able to directly access government-published datasets using standard methods such as queries via an API (Application Programming Interface).
  • Government bodies publishing data online should always seek to publish using data formats that do not include executable content.
  • Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.
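
To make the first and last recommendations concrete, here is a minimal sketch of what a citizen could do with a machine-readable dataset. Everything in it is an illustrative assumption rather than part of the USACM text: the CSV columns (`state`, `program`, `amount_usd`), the sample rows, and the idea that the publisher posts a SHA-256 digest alongside the file. A bare digest only checks integrity; it is not a digital signature and does not by itself establish who published the data.

```python
import csv
import hashlib
import hmac
import io
from collections import defaultdict

# Hypothetical dataset: invented rows standing in for a published CSV file.
SAMPLE_CSV = (
    "state,program,amount_usd\n"
    "NJ,Highways,1200000\n"
    "NJ,Broadband,300000\n"
    "CA,Highways,2500000\n"
)


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_digest(data: bytes, published_digest: str) -> bool:
    """Check downloaded bytes against a digest the publisher is assumed to post."""
    return hmac.compare_digest(sha256_hex(data), published_digest.strip().lower())


def total_by_state(csv_text: str) -> dict:
    """Sum the amount_usd column per state: an analysis the reader chooses."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["state"]] += int(row["amount_usd"])
    return dict(totals)


if __name__ == "__main__":
    raw = SAMPLE_CSV.encode("utf-8")
    digest = sha256_hex(raw)  # stands in for the publisher's posted digest
    if verify_digest(raw, digest):
        print(total_by_state(SAMPLE_CSV))
```

The point of the sketch is that once the data is in a plain, documented format, the choice of analysis belongs to the reader and needs nothing beyond a standard library.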

Comments

  1. dgr says:

    Commenting on this post was initially, accidentally, disabled. It has now been re-enabled. Apologies for my error: The floor is now open.

    David

  2. Mitch Golden says:

    This is a continuation of a comment I made to David in an e-mail. My query was about where the requirement for open formats was in these recommendations.

    I guess I confused matters by using the term “non-proprietary” when I should have used the term “open”. (By my definition, none of the MS formats would truly qualify as such, because other than Microsoft’s own code, there are no perfect implementations of, say, Word. That is what was at issue in Massachusetts in 2005.)

    Without going afield on Microsoft, however, I must be missing something. Let’s consider that the government releases data in a format that is clearly not open, one that can only be used with some specific, but widely available, program for which the document structure is not publicly documented and is covered by trade secret or patent protection. Let’s call the program DRDoc. As I go through the recommendations, I don’t see which one this falls afoul of. If I am a government IT employee, trying to follow your recommendations, this is how I would read them:

    –> Data published by the government should be in formats and approaches that promote analysis and reuse of that data.

    Sure, anyone can get DRDoc and use this data.

    –> Data republished by the government that has been received or stored in a machine-readable format (such as online regulatory filings) should preserve the machine-readability of that data.

    Sure. The data generally originates in DRDoc anyway.

    –> Information should be posted so as to also be accessible to citizens with limitations and disabilities.

    No problem. DRDoc has a mode for use by blind people.

    –> Citizens should be able to download complete datasets of regulatory, legislative or other information, or appropriately chosen subsets of that information, when it is published by government.

    We have carefully organized our site so that all the DRDoc files stored in the repository are easily locatable, and can be fetched in any group the user desires.

    –> Citizens should be able to directly access government-published datasets using standard methods such as queries via an API (Application Programming Interface).

    We have a nice MySQL front end that allows metadata queries, and returns an HTML page with links to the DRDoc files. These front ends are all machine queryable.

    –> Government bodies publishing data online should always seek to publish using data formats that do not include executable content.

    Sure. The DRDoc files contain no executable content and there are both Mac and PC versions of DRDoc.

    –> Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.

    We download all DRDoc files in encrypted, signed wrapper files.

    So, we’re good to go? And in 20 years when DRDoc no longer supports this format and no one else can write a program for it because the spec for the format was never released, are we out of luck?

  3. HaeB says:

    “There’s a political groundswell underway across the country to unlock government information and make it more available to citizens.”

    The above comment by “a systems integrator working on several ['information superhighway'] projects” dates from 1994. (http://bubl.ac.uk/archive/journals/edupage/940714.htm)