
CITP is a Google Summer of Code 2010 Mentoring Organization

The Google Summer of Code program provides student stipends for summer work on open source projects. CITP is thrilled to have been chosen as a mentoring organization for 2010, meaning that students will be working on some CITP projects this summer. We think that these projects are very interesting, and potential participants now have the opportunity to propose their ideas for what they’d like to work on. Applications accepted from March 29 to April 9.

You can browse our list of project ideas, read our overall description, and apply here.

Side-Channel Leaks in Web Applications

Popular online applications may leak your private data to a network eavesdropper, even if you’re using secure web connections, according to a new paper by Shuo Chen, Rui Wang, XiaoFeng Wang, and Kehuan Zhang. (Chen is at Microsoft Research; the others are at Indiana.) It’s a sobering result — yet another illustration of how much information can be leaked by ordinary web technologies. It’s also really clever.

Here’s the background: Secure web connections encrypt traffic so that only your browser and the web server you’re visiting can see the contents of your communication. Although a network eavesdropper can’t understand the requests your browser sends, nor the replies from the server, it has long been known that an eavesdropper can see the size of the request and reply messages, and that these sizes sometimes leak information about which page you’re viewing, if the request size (i.e., the size of the URL) or the reply size (i.e., the size of the HTML page you’re viewing) is distinctive.

The new paper shows that this inference-from-size problem gets much, much worse when pages are using the now-standard AJAX programming methods, in which a web “page” is really a computer program that makes frequent requests to the server for information. With more requests to the server, there are many more opportunities for an eavesdropper to make inferences about what you’re doing — to the point that common applications leak a great deal of private information.

Consider a search engine that autocompletes search queries: when you start to type a query, the search engine gives you a list of suggested queries that start with whatever characters you have typed so far. When you type the first letter of your search query, the search engine page will send that character to the server, and the server will send back a list of suggested completions. Unfortunately, the size of that suggested completion list will depend on which character you typed, so an eavesdropper can use the size of the encrypted response to deduce which letter you typed. When you type the second letter of your query, another request will go to the server, and another encrypted reply will come back, which will again have a distinctive size, allowing the eavesdropper (who already knows the first character you typed) to deduce the second character; and so on. In the end the eavesdropper will know exactly which search query you typed. This attack worked against the Google, Yahoo, and Microsoft Bing search engines.
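
To make the mechanics concrete, here is a minimal sketch of the inference step, not taken from the paper: the eavesdropper probes the autocomplete service to learn what encrypted response size each candidate next character would produce, then matches those sizes against the sizes observed on the victim's connection. The function simulated_response_size and all sizes below are hypothetical stand-ins for real measurements.

    # A toy model of size-based keystroke inference (all sizes are made up).
    import string

    def simulated_response_size(prefix):
        # Hypothetical stand-in for "request suggestions for this prefix and
        # measure the size of the encrypted reply."
        return 400 + sum((i + 1) * ord(c) for i, c in enumerate(prefix)) % 391

    def infer_query(observed_sizes):
        recovered = ""
        for size in observed_sizes:
            # Probe every possible next character, given the prefix recovered so far.
            matches = [c for c in string.ascii_lowercase
                       if simulated_response_size(recovered + c) == size]
            if len(matches) != 1:
                break  # ambiguous size; real attacks combine additional clues
            recovered += matches[0]
        return recovered

    # The "victim" types a query; the eavesdropper sees only the response sizes.
    victim_query = "flu"
    observed = [simulated_response_size(victim_query[:i + 1])
                for i in range(len(victim_query))]
    print(infer_query(observed))  # recovers "flu" when every size is unambiguous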

Many web apps that handle sensitive information seem to be susceptible to similar attacks. The researchers studied a major online tax preparation site (which they don’t name) and found that it leaks a fairly accurate estimate of your Adjusted Gross Income (AGI). This happens because the exact set of questions you have to answer, and the exact data tables used in tax preparation, vary based on your AGI. To give one example, there is a particular interaction relating to a possible student loan interest calculation that happens only if your AGI is between $115,000 and $145,000 — so the presence or absence of the distinctively-sized message exchange relating to that calculation tells an eavesdropper whether your AGI falls in that range. By assembling a set of clues like this, an eavesdropper can get a good fix on your AGI, plus information about your family status, and so on.
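
The same logic can be sketched as a simple interval intersection, again with hypothetical numbers rather than anything from the paper: each AGI-gated exchange the eavesdropper observes implies a range the victim's AGI must fall in, and intersecting those ranges narrows the estimate.

    # Combining presence clues by intersecting AGI ranges (hypothetical ranges).
    observed_ranges = [
        (115_000, 145_000),  # e.g. the student loan interest interaction was seen
        (100_000, 130_000),  # some other AGI-gated exchange was seen
    ]

    low = max(lo for lo, _ in observed_ranges)
    high = min(hi for _, hi in observed_ranges)
    print(f"AGI is between ${low:,} and ${high:,}")  # between $115,000 and $130,000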

For similar reasons, a major online health site leaks information about which medications you are taking, and a major investment site leaks information about your investments.

The paper goes on to consider possible mitigations. The most obvious mitigation is to add padding to messages so that their sizes are not so distinctive — for example, every message might be padded to make its size a multiple of 256 bytes. This turns out to be less effective than you might expect — significant information can still leak even if messages are generously padded — and the padded messages are slower and more expensive to transmit.
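
As a rough illustration of why padding helps less than you might expect, consider the example mitigation above of rounding every message up to a multiple of 256 bytes; the sizes below are hypothetical. Responses whose true sizes land in different 256-byte buckets remain distinguishable after padding.

    # Why rounding up to a 256-byte multiple still leaks (sizes are made up).
    def padded_size(n, block=256):
        return ((n + block - 1) // block) * block  # round up to a multiple of block

    true_sizes = {"a": 812, "b": 1040, "c": 833, "d": 1290}
    for key, size in true_sizes.items():
        print(key, size, "->", padded_size(size))

    # "a" and "c" become indistinguishable (both pad to 1024 bytes), but "b"
    # (1280) and "d" (1536) still stand out, so some information survives padding.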

We don’t know which sites the researchers studied, but it seems like a safe bet that most, if not all, of the sites in these product categories have similar problems. It’s important to keep these attacks in perspective — bear in mind that they can only be carried out by someone who can eavesdrop on the network between you and the site you’re visiting.

It’s becoming increasingly clear that securing web-based applications is very difficult, and that the basic tools for developing web apps don’t do much to help. The industry, and researchers, will be struggling with web app security issues for years to come.

Domain Names Can't Defend Themselves

Today, the Kentucky Supreme Court handed down an opinion in the saga of Kentucky vs. 141 Domain Names (described a while back here on this blog). Here’s the opinion.

This case is fascinating. A quick recap: Kentucky attempted a property seizure of 141 domain names allegedly involved in gambling on the theory that the domain names themselves constituted “gambling devices” under Kentucky law and were therefore illegal. The state held a forfeiture hearing where anyone with an interest in the “property” could show up to defend their interest in the property; otherwise, the State would order the registrars to transfer “ownership” of the domain names to Kentucky. No individual claiming that they own one of the domain names showed up. Litigation began when two industry associations (iMEGA and IGC) claimed to represent unnamed persons who owned these domain names (and another lawyer showed up during litigation claiming representation of one specific domain name).

The subsequent litigation gets a bit complicated; suffice it to say that the issue of standing is what got the case to the KY Supreme Court: could an association claiming to represent an owner of a domain name affected in this action properly represent that owner in court without identifying the owner and establishing that the owner did in fact own an affected domain name?

The Kentucky Supreme Court said no: there needs to be at least one identified individual owner who will suffer harm before the association can stand in that owner’s stead. The court ruled,

Due to the incapacity of domain names to contest their own seizure and the inability of iMEGA and IGC to litigate on behalf of anonymous registrants, the Court of Appeals is reversed and its writ is vacated.

And on the issue of whether a piece of property can represent itself:

“An Internet domain name does not have an interest in itself any more than a piece of land is interested in its own use.”

Anyway, it would seem that the options for next steps include: 1) identifying at least one owner who would suffer harm and then moving back up to the Supreme Court (given that the merits had been argued at the Appeals level), or 2) deciding that the anonymity of domain name ownership in this case is more important than the fight over this very weird seizure of domain names.

As a non-lawyer, I wonder whether it would be possible to represent an owner as a John Doe, submitting an affidavit of ownership of an affected domain name.

UPDATE (2010-03-19T00:07:07 EDT): Check the comments for why a John Doe strategy won’t work when the interest in anonymity is to avoid personal liability rather than free expression.

A weird bonus for people who have read this far: if I open the PDF of the opinion on my Mac in Preview.app or Skim.app (two PDF readers), the “SPORTSBOOK.COM” entry in the listing of the parties on the first page is hyperlinked. However, I don’t see this in Adobe Acrobat Pro or Reader. It seems the KY Supreme Court is, likely inadvertently, linking to one of the 141 domain names. Of course, Preview.app and Skim.app might be sharing the same library that causes this one URL to be linked… I’m not enough of a PDF sleuth to figure it out.
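
For anyone who wants to play PDF sleuth, one way to check whether the link is actually embedded in the file (as opposed to being added by the viewer’s own URL detection) is to dump the link annotations on the first page. A minimal sketch using the pypdf library follows; the file name is hypothetical.

    # List URI link annotations on the first page of the opinion PDF.
    # If nothing prints, the "SPORTSBOOK.COM" link is probably being synthesized
    # by the PDF viewer rather than stored in the file itself.
    from pypdf import PdfReader

    reader = PdfReader("ky_supreme_court_opinion.pdf")  # hypothetical file name
    for ref in (reader.pages[0].get("/Annots") or []):
        annot = ref.get_object()
        if annot.get("/Subtype") == "/Link":
            uri = annot.get("/A", {}).get("/URI")
            if uri:
                print("Embedded link:", uri)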

Round 2 of the PACER Debate: What to Expect

The past year has seen an explosion of interest in free access to the law. Indeed, something of a movement appears to be coalescing around the issue, due in no small part to the growing Law.gov effort (see the latest list of events). One subset of this effort is our work on PACER, the online document access system for the federal courts. We contend that access to electronic court records should be free (see posts from me, Tim, and Harlan). Our RECAP project helps make some of these documents more accessible, and has gained adoption far above our expectations. That being said, RECAP doesn’t solve the fundamental problem: the federal government needs to publish the full public record for free online. Today, this argument came from an unlikely source, the FCC’s National Broadband Plan.

RECOMMENDATION 15.1: the primary legal documents of the federal government should be free and accessible to the public on digital platforms. […]

– For the Judicial branch, this should apply to all judicial opinions.

[…] Finally, all federal judicial decisions should be accessible for free and made publicly available to the people of the United States. Currently, the Public Access to Court Electronic Records system charges for access to federal appellate, district and bankruptcy court records.[7] As a result, U.S. federal courts pay private contractors approximately $150 million per year for electronic access to judicial documents.[8] [Steve note: The correct figure is $150m over 10 years. However it is quite possible that the federal government as a whole spends $150m or more per year for access to case materials.] While the E-Government Act has mandated that this system change so that this information is as freely available as possible, little progress has been made.[9] Congress should consider providing sufficient funds to publish all federal judicial opinions, orders and decisions online in an easily accessible, machine-readable format.

[7] See Public Access To Court Electronic Records—Overview, http://pacer.psc.uscourts.gov/pacerdesc.html (last visited Jan. 7, 2010).
[8] Carl Malamud, President and CEO, Public.Resource.Org, By the People, Address at the Gov 2.0 Summit, Washington, D.C. 25 (Sept. 10, 2009), available at http://resource.org/people/3waves_cover.pdf
[9] See Letter from Sen. Joseph I. Lieberman to Carl Malamud, President and CEO, Public.Resource.Org (Oct. 13, 2009), available at http://bulk.resource.org/courts.gov/foia/gov.senate.lieberman_20091013_from.pdf

This issue is outside of the Commission’s direct jurisdiction, but the Broadband Plan is intended as a blueprint for the federal government as a whole. In that context, the notion of ensuring that primary legal materials are available for free online fits perfectly with a broader effort to make government digitally accessible. In a similar vein, a bill was introduced today by Rep. Israel. The Public Online Information Act, backed by the Sunlight Foundation, creates a new federal advisory committee to advise all three branches of government on how to make government information available online for free.

To establish an advisory committee to issue nonbinding government-wide guidelines on making public information available on the Internet, to require publicly available Government information held by the executive branch to be made available on the Internet, to express the sense of Congress that publicly available information held by the legislative and judicial branches should be available on the Internet, and for other purposes.

These two developments are the first of what I expect to be many announcements in the coming months, coming from places like the transparency caucus. These announcements will share a theme — there is a growing mandate for universal free access to government information, and judicial information is a key component of that mandate. These requirements will increasingly go to the heart of full free access to the public record, and will reveal the discrepancies between different branches in this regard.

The FCC’s language doesn’t quite get everything right. Most notably, it focuses on opinions even though other components of the record are also key to the public’s understanding of the law. Opinions on PACER are already theoretically free, but the kludgy system for accessing them doesn’t include all of the opinions, isn’t indexable by search engines, and gives only a minimal amount of information about the case that each opinion is a part of. Furthermore, the docket text required to understand the context and the search functionality required to find the opinions both require a fee. Subsequent calls for free access to case materials will have to be more holistic than the opinions-only language of the Broadband Plan.

The POIA language is also a step forward. A federal advisory committee is a good thing in the context of a branch that is more accustomed to the adversarial process than notice-and-comment. However, we will need much more concrete requirements before we will have achieved our goals.

In the context of these announcements, the Administrative Office of the Courts made its own announcement today. The Judicial Conference has voted in favor of two measures that make incremental improvements to the current pay-wall model of access to PACER.

  • Adjust the Electronic Public Access fee schedule so that users are not billed unless they accrue charges of more than $10 of PACER usage in a quarterly billing cycle, in effect quadrupling the amount of data available without charge. Currently, users are not billed until their accounts total at least $10 in a one-year period.
  • Approve a pilot in up to 12 courts to publish federal district and bankruptcy court opinions via the Government Printing Office’s Federal Digital System (FDsys) so members of the public can more easily search across opinions and across courts.

These are minor tweaks on a fundamentally limited system. Don’t get me wrong — a world with these changes is better than a world without. It is slightly easier to avoid spending more than $10 in a given quarter than in a given year, but it’s nevertheless likely that you will do so unless you know exactly what you are looking for and retrieve only a few documents. It’s also good to establish precedent for GPO publishing case materials, but that doesn’t require a limited trial that could end in bureaucratic quagmire. The GPO can handle publishing many documents, and any reasonably qualified software engineer could figure out how to deliver them in short order. What’s more, the courts could provide universal free public access today, with zero engineering work: offer a single PACER login that is never billed or, better yet, just stop billing all accounts.
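
For a rough sense of what these thresholds mean in practice, here is a back-of-the-envelope calculation assuming the then-current PACER fee of $0.08 per page (and ignoring the per-document cap on long filings):

    # How many free pages per year does each billing threshold allow?
    # Assumes the $0.08-per-page fee in effect at the time; amounts in cents
    # to avoid floating-point rounding.
    FEE_CENTS_PER_PAGE = 8
    THRESHOLD_CENTS = 1000   # $10 of usage before an account is billed

    pages_per_cycle = THRESHOLD_CENTS // FEE_CENTS_PER_PAGE   # 125 pages per cycle
    print("Old annual threshold:   ", pages_per_cycle * 1, "free pages per year")
    print("New quarterly threshold:", pages_per_cycle * 4, "free pages per year")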

The next round of the PACER debate will be over whether we make a fundamental change in access to federal court records, or concede minor tweaks and call it a day.

Global Internet Freedom and the U.S. Government

Over the past two weeks I’ve testified in both the Senate and the House on how the U.S. should advance “Internet freedom.” I submitted written testimony for both hearings, which can be downloaded in PDF form here and here. Full transcripts will become available eventually; meanwhile you can click here to watch the Senate video and here to watch the House video. In both hearings I advocated a combination of corporate responsibility through the Global Network Initiative, backed up by appropriate legislation given that some companies seem reluctant to hold themselves accountable voluntarily; revision of export controls and sanctions; and finally, funding and support for tools, technologies, and activism platforms that counteract suppression of online speech.

Lawmakers are moving forward to support research and technical development. On February 4th, Rep. David Wu [D-OR] and Rep. Frank Wolf [R-VA] introduced the Internet Freedom Act of 2010, which would establish an Internet Freedom Foundation. The bill’s core section reads:

(a) ESTABLISHMENT OF THE INTERNET FREEDOM FOUNDATION. – The National Science Foundation shall establish the Internet Freedom Foundation. The Internet Freedom Foundation shall –

(1) award competitive, merit-reviewed grants, cooperative agreements, or contracts to private industry, universities, and other research and development organizations to develop deployable technologies to defeat Internet suppression and censorship; and
(2) award incentive prizes to private industry, universities, and other research and development organizations to develop deployable technologies to defeat Internet suppression and censorship.

(b) LIMITATION ON AUTHORITY. – Nothing in this Act shall be interpreted to authorize any action by the United States to interfere with foreign national censorship in furtherance of law enforcement aims that are consistent with the International Covenant on Civil and Political Rights.

Whoever runs this foundation will have their work cut out for them in sorting out its strategies, goals, and priorities – and in dealing with a great deal of thorny politics. The Falun Gong-affiliated Global Internet Freedom Consortium has been arguing that it was unfairly passed over for recent State Department grants, which were given to other groups working on different tools that help you get around Internet blocking – “circumvention tools,” as the technical community calls them. For the past year it has been engaged in an aggressive campaign to lobby Congress and the media to ensure it gets a slice of future funds. (For examples of the fruits of this media lobbying effort, see here, here, and here).

But the unfortunate bickering over who deserves government funding more than whom has distracted attention from the larger question of whether circumvention on its own is sufficient to defeat Internet censorship and suppression of online speech. In his recent blog post, Internet Freedom: Beyond Circumvention, my friend and former colleague Ethan Zuckerman warns against an over-focus on circumvention: “We can’t circumvent our way around internet censorship.” In short, he summarizes his main points:

– Internet circumvention is hard. It’s expensive. It can make it easier for people to send spam and steal identities.
– Circumventing censorship through proxies just gives people access to international content – it doesn’t address domestic censorship, which likely affects the majority of people’s internet behavior.
– Circumventing censorship doesn’t offer a defense against DDoS or other attacks that target a publisher.

While circumvention tools remain worthy of support as part of a basket of strategies, I agree with Ethan that circumvention is never going to be the silver bullet that some people make it out to be, for all the reasons he outlines in his blog post, which deserves to be read in full. As Ethan points out, as I pointed out in my own testimony, and as my research on Chinese blog censorship published last year has demonstrated, circumvention does nothing to help you access content that has been removed from the Internet completely – which is the main way that the Chinese government now censors the Chinese-language Internet. In my testimony I suggested several other tools and activities that require an equal amount of focus:

  • Tools and training to help people evade surveillance, detect spyware, and guard against cyber-attacks.
  • Mechanisms to preserve and re-distribute censored content in various languages.
  • Platforms through which citizens around the world can share “opposition research” about what different governments are doing to suppress online speech, and collaborate across borders to defeat censorship, surveillance, and attacks in ad-hoc, flexible ways as new problems arise during times of crisis.

As Ethan puts it:

– We need to shift our thinking from helping users in closed societies access blocked content to helping publishers reach all audiences. In doing so, we may gain those publishers as a valuable new set of allies as well as opening a new class of technical solutions.

– If our goal is to allow people in closed societies to access an online public sphere, or to use online tools to organize protests, we need to bring the administrators of these tools into the dialog. Secretary Clinton suggests that we make free speech part of the American brand identity – let’s find ways to challenge companies to build blocking resistance into their platforms and to consider internet freedom to be a central part of their business mission. We need to address the fact that making their platforms unblockable has a cost for content hosts and that their business models currently don’t reward them for providing service to these users.

Which brings us to the issue of corporate responsibility for free expression and privacy on the Internet. I’ve been working with the Global Network Initiative for the past several years to develop a voluntary code of conduct centered on a set of basic principles for free expression and privacy based on U.N. documents like the Universal Declaration of Human Rights, the International Covenant on Civil and Political Rights, and other international legal conventions. It is bolstered by a set of implementation guidelines and evaluation and accountability mechanisms, supported by a multi-stakeholder community of human rights groups, investors, and academics all dedicated to helping companies do the right thing and avoid making mistakes that restrict free expression and privacy on the Internet.

So far, however, only Google, Microsoft, and Yahoo have joined. Senator Durbin’s March 2nd Senate hearing focused heavily on why other companies have so far failed to join, what it would take to persuade them to join, and, if they don’t join, whether laws should be passed that would induce greater public accountability by companies on free expression and privacy. He has written letters to 30 U.S. companies in the information and communications technology (ICT) sector. He expressed great displeasure in the hearing with most of their responses, and further disappointment that no company (other than Google, which is already in the GNI) even had the guts to send a representative to testify at the hearing. Durbin announced that he will “introduce legislation that would require Internet companies to take reasonable steps to protect human rights or face civil or criminal liability.” It is my understanding that his bill is still under construction, and it’s not clear when he will introduce it (he’s been rather preoccupied with healthcare and other domestic issues, after all). Congressman Howard Berman (D-CA), who convened Wednesday’s House Foreign Affairs Committee hearing, is also reported to be considering his own bill. Rep. Chris Smith (R-NJ), the ranking Republican at that hearing, made a plug for the Global Online Freedom Act of 2009, a somewhat revised version of a bill he first introduced in 2006.

I said at the hearing that the GNI probably wouldn’t exist if it hadn’t been for the threat of Smith’s legislation. I was not, however, asked my opinion on GOFA’s specific content. Since GOFA’s 2006 introduction I have critiqued it a number of times (see, for example, here, here, and here). As the years have passed – especially in the past year, as the GNI got up and running yet most companies still failed to engage meaningfully with it – I have come to see the important role legislation could play in setting industry-wide standards and requirements, which companies can then tell governments they have no choice but to follow. That said, I continue to have concerns about parts of GOFA’s approach. Here is a summary of the current bill written by the Congressional Research Service (I have bolded the parts of concern):

5/6/2009–Introduced.
Global Online Freedom Act of 2009 – Makes it U.S. policy to: (1) promote the freedom to seek, receive, and impart information and ideas through any media; (2) use all appropriate instruments of U.S. influence to support the free flow of information without interference or discrimination; and (3) deter U.S. businesses from cooperating with Internet-restricting countries in effecting online censorship. Expresses the sense of Congress that: (1) the President should seek international agreements to protect Internet freedom; and (2) some U.S. businesses, in assisting foreign governments to restrict online access to U.S.-supported websites and government reports and to identify individual Internet users, are working contrary to U.S. foreign policy interests. Amends the Foreign Assistance Act of 1961 to require assessments of electronic information freedom in each foreign country. Establishes in the Department of State the Office of Global Internet Freedom (OGIF). Directs the Secretary of State to annually designate Internet-restricting countries. Prohibits, subject to waiver, U.S. businesses that provide to the public a commercial Internet search engine, communications services, or hosting services from locating, in such countries, any personally identifiable information used to establish or maintain an Internet services account. Requires U.S. businesses that collect or obtain personally identifiable information through the Internet to notify the OGIF and the Attorney General before responding to a disclosure request from an Internet-restricting country. Authorizes the Attorney General to prohibit a business from complying with the request, except for legitimate foreign law enforcement purposes. Requires U.S. businesses to report to the OGIF certain Internet censorship information involving Internet-restricting countries. Prohibits U.S. businesses that maintain Internet content hosting services from jamming U.S.-supported websites or U.S.-supported content in Internet-restricting countries. Authorizes the President to waive provisions of this Act: (1) to further the purposes of this Act; (2) if a country ceases restrictive activity; or (3) if it is the national interest of the United States.

My biggest concern has to do with the relationship GOFA would create between U.S. companies and the U.S. Attorney General. If the AG is made the arbiter of whether content or account information requested by local law enforcement is for “legitimate law enforcement purposes” or not, that means the company has to share the information with the AG – which, in the case of certain social networking services, may include a great deal of non-public information about the user, who his or her friends are, and what they’re saying to each other in casual conversation. Letting the U.S. AG review the insides of this person’s account would certainly violate that user’s privacy. It also puts companies at a competitive disadvantage in markets where users – even those who don’t particularly like their own government – would consider an overly close relationship between a U.S. service provider and the U.S. government not to be in their interest. Take this hypothetical situation, for example: An Egyptian college student decides to use a social networking site to set up a group protesting the arrest and torture of his brother. The Egyptian government demands the group be shut down and all account information associated with it handed over. In order to comply with GOFA, the company shares this student’s account information and all content associated with that protest group with the U.S. Attorney General. What is the oversight to ensure that this information is not retained and shared with other U.S. government agencies interested in going on a fishing expedition to explore friendships among members of different Egyptian opposition groups? Why should we expect that user to be OK with such a possibility?

Another difficult issue to get right – which gets even harder with the advent of cloud computing – is the question of where user data is physically housed. The Center for Democracy and Technology (PDF), Jonathan Zittrain, and others have discussed some of the regulatory difficulties of personally identifiable information and its location. In 2008 Zittrain wrote:

As Internet law rapidly evolves, countries have repeatedly and successfully demanded that information be controlled or monitored, even when that information is hosted outside their borders. Forcing US companies to locate their servers outside IRCs [Internet Restricting Countries] would only make their services less reliable; it would not make them less regulable.

If the goal of GOFA is to discourage US companies from violating human rights, then it will probably be successful. But if the goal of the Act is to make the Internet more free and more safe, and not just push rights violations on foreign companies, then more must be done.

Then there is the problem of the Internet Restricting Country designations themselves. I have long argued that it is problematic to divide the world into “internet restricting countries” and countries where we can assume everything is just fine, not to worry, no human rights concerns present. First of all, I think the list itself will quickly turn into a political and diplomatic football, subject to huge amounts of lobbying and politics, and it will thus be very difficult to add new countries to it. Secondly, regimes can change fast: in between annual revisions of the list you can have a coup or a rigged election whose victors demand that companies hand over dissident account information and censor political information, but companies are off the hook – having “done nothing illegal.” Finally, while I am not drawing a moral equivalence between Italy and Iran, I do believe there is no country on earth, including the United States, where companies are not under pressure from government agencies to do things that arguably violate users’ civil rights. Policy that acknowledges this honestly is less likely to hurt U.S. companies in many parts of the world, where the last thing they need is for people to be able to provide “documentary proof” that they are extensions of the U.S. government’s geopolitical agendas.

Therefore a more effective, ethically consistent, and less hypocritical approach to the three problems I’ve described above would be to codify strict global privacy standards absolutely everywhere U.S. companies operate. Companies should be required by law to notify all users anywhere in the world, in a clear, culturally and linguistically understandable way (understandable by normal people, not just trained lawyers), exactly how and where their personally identifying information is being stored and used, and who has access to it under what circumstances. If users are better informed about how their data is being used, they can use better judgment about how or whether to use different commercial services – and seek more secure alternatives when necessary, perhaps even using some of the new tools and platforms run by non-profit activist organizations that Congress is hoping to fund. Congress could further bolster the privacy of global users of U.S. services by adopting something akin to the Council of Europe Privacy Convention.

Regarding censorship: again, as the Internet evolves further with semi-private social networking sites and mobile services, we need to make sure that the information companies are required to share with the U.S. government doesn’t end up violating user privacy. I am doubtful that government agencies in some of the democracies unlikely to be put on the “internet restricting countries” list can really be trusted not to abuse the systems of censorship and intermediary liability that a growing number of democracies are implementing in the name of legitimate law enforcement purposes. Thus on censorship I also prefer global standards. There is real value in making companies retain internal records of the censorship requests they receive all around the world, in the event of a challenge in U.S. court regarding the lawfulness of a particular act of censorship – a private right of action in U.S. court which GOFA or its equivalent would potentially enable. It’s also good to make companies establish clear and uniform procedures for how they handle censorship requests, so that they can prove, if challenged in court, that they only respond to requests made in writing through official legal channels, rather than responding to requests that have no basis even in local law while claiming vaguely to the public that “we are only following local law.” Companies should be required to exercise maximum transparency with users about what is being censored, at whose behest, and according to which law exactly. Congress could, for example, mandate that the Chilling Effects Clearinghouse mechanism or something similar be used globally for all content takedowns.
(Originally posted at my blog, RConversation.)