June 23, 2017

Archives for March 2010

Side-Channel Leaks in Web Applications

Popular online applications may leak your private data to a network eavesdropper, even if you’re using secure web connections, according to a new paper by Shuo Chen, Rui Wang, XiaoFeng Wang, and Kehuan Zhang. (Chen is at Microsoft Research; the others are at Indiana.) It’s a sobering result — yet another illustration of how much information can be leaked by ordinary web technologies. It’s also really clever.

Here’s the background: Secure web connections encrypt traffic so that only your browser and the web server you’re visiting can see the contents of your communication. A network eavesdropper can’t read the requests your browser sends or the replies from the server. But it has long been known that an eavesdropper can see the sizes of the request and reply messages, and that these sizes sometimes reveal which page you’re viewing, if the request size (i.e., the size of the URL) or the reply size (i.e., the size of the HTML page you’re viewing) is distinctive.

The new paper shows that this inference-from-size problem gets much, much worse when pages are using the now-standard AJAX programming methods, in which a web “page” is really a computer program that makes frequent requests to the server for information. With more requests to the server, there are many more opportunities for an eavesdropper to make inferences about what you’re doing — to the point that common applications leak a great deal of private information.

Consider a search engine that autocompletes search queries: when you start to type a query, the search engine gives you a list of suggested queries that start with whatever characters you have typed so far. When you type the first letter of your search query, the search engine page will send that character to the server, and the server will send back a list of suggested completions. Unfortunately, the size of that suggested completion list will depend on which character you typed, so an eavesdropper can use the size of the encrypted response to deduce which letter you typed. When you type the second letter of your query, another request will go to the server, and another encrypted reply will come back, which will again have a distinctive size, allowing the eavesdropper (who already knows the first character you typed) to deduce the second character; and so on. In the end the eavesdropper will know exactly which search query you typed. This attack worked against the Google, Yahoo, and Microsoft Bing search engines.
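
The character-by-character inference can be sketched in a few lines of Python. This is a toy model under loud assumptions, not the method from the paper: the `SUGGESTIONS` table, the `response_size` function, and the idea that the eavesdropper can rebuild the same size table by querying the search engine are all hypothetical stand-ins.

```python
import string

# Hypothetical suggestion table standing in for a real autocomplete
# backend (an assumption for this sketch, not data from the paper).
SUGGESTIONS = {
    "f": ["facebook", "fox news", "fedex"],
    "fl": ["flights", "flickr"],
    "flu": ["flu symptoms", "flu shot side effects", "flu"],
    "g": ["gmail", "google maps"],
    "go": ["google", "gold price today", "good morning america"],
}


def response_size(prefix):
    """Byte length of the reply for a typed prefix.

    In this model, encryption hides the content but not the length.
    """
    body = "\n".join(SUGGESTIONS.get(prefix, []))
    return len(body.encode("utf-8"))


def infer_query(observed_sizes, alphabet=string.ascii_lowercase):
    """Recover the typed string one character at a time from reply sizes.

    Returns (recovered_prefix, remaining_candidates). The candidate list
    is empty when every observed size matched exactly one character.
    """
    known = ""
    for size in observed_sizes:
        candidates = [c for c in alphabet if response_size(known + c) == size]
        if len(candidates) != 1:
            return known, candidates  # ambiguous or no match: stop here
        known += candidates[0]
    return known, []


# The eavesdropper sees only these three sizes, never the letters.
observed = [response_size(p) for p in ("f", "fl", "flu")]
print(infer_query(observed))
```

In this toy table every reply size is unique, so the query is recovered in full. With real suggestion lists some sizes would collide, and the eavesdropper would be left with a short list of candidates rather than a single answer.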

Many web apps that handle sensitive information seem to be susceptible to similar attacks. The researchers studied a major online tax preparation site (which they don’t name) and found that it leaks a fairly accurate estimate of your Adjusted Gross Income (AGI). This happens because the exact set of questions you have to answer, and the exact data tables used in tax preparation, vary based on your AGI. To give one example, a particular interaction relating to a possible student loan interest calculation happens only if your AGI is between $115,000 and $145,000, so the presence or absence of the distinctively sized message exchange for that calculation tells an eavesdropper whether your AGI falls in that range. By assembling a set of clues like this, an eavesdropper can get a good fix on your AGI, plus information about your family status, and so on.
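
Combining such clues amounts to intersecting and subtracting income ranges. The sketch below illustrates that bookkeeping; only the $115,000 to $145,000 student loan range comes from the paper as described here. The second clue, the starting range, and the choice to treat ranges as half-open are assumptions made for illustration.

```python
def narrow(intervals, low, high, seen):
    """Refine the candidate AGI ranges using one observed clue.

    intervals: list of (lo, hi) half-open ranges still considered possible.
    low, high: the AGI range in which the distinctive exchange occurs.
    seen:      True if the eavesdropper observed that exchange.
    """
    out = []
    for lo, hi in intervals:
        if seen:
            # Keep only the overlap with [low, high).
            a, b = max(lo, low), min(hi, high)
            if a < b:
                out.append((a, b))
        else:
            # Remove [low, high) from this candidate range.
            if lo < min(hi, low):
                out.append((lo, min(hi, low)))
            if max(lo, high) < hi:
                out.append((max(lo, high), hi))
    return out


# Start with a wide, arbitrary range of possible incomes (an assumption).
candidates = [(0, 1_000_000)]

# Clue from the post: the student loan exchange was observed.
candidates = narrow(candidates, 115_000, 145_000, seen=True)

# Hypothetical second clue: an exchange that occurs only below $130,000
# was NOT observed.
candidates = narrow(candidates, 0, 130_000, seen=False)

print(candidates)
```

Each additional clue can only shrink the candidate set, which is why a handful of yes/no observations is enough to pin the income down to a fairly narrow band.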

For similar reasons, a major online health site leaks information about which medications you are taking, and a major investment site leaks information about your investments.

The paper goes on to consider possible mitigations. The most obvious mitigation is to add padding to messages so that their sizes are not so distinctive — for example, every message might be padded to make its size a multiple of 256 bytes. This turns out to be less effective than you might expect — significant information can still leak even if messages are generously padded — and the padded messages are slower and more expensive to transmit.
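
One way to see why padding helps less than expected is to group replies by the size an eavesdropper observes after padding. The sketch below is only an illustration: the 256-byte block size is the example from the text, while the labels and raw sizes are invented, and the paper's own analysis may rest on other effects as well.

```python
from collections import defaultdict


def padded_size(n, block=256):
    """Round n up to the next multiple of block (unchanged if already aligned)."""
    return -(-n // block) * block


def ambiguity_sets(sizes, block=256):
    """Group labeled reply sizes by what is visible on the wire after padding."""
    groups = defaultdict(list)
    for label, n in sizes.items():
        groups[padded_size(n, block)].append(label)
    return dict(groups)


# Hypothetical raw reply sizes in bytes for four different user actions.
raw = {"a": 310, "b": 480, "c": 530, "d": 1200}

groups = ambiguity_sets(raw)
overhead = sum(padded_size(n) for n in raw.values()) - sum(raw.values())

print(groups)    # actions that share a padded size are indistinguishable
print(overhead)  # extra bytes sent because of padding
```

Here padding hides the difference between "a" and "b", but "c" and "d" still land in buckets of their own, so they remain identifiable, and the four replies together carry several hundred extra bytes. Larger blocks merge more replies at the cost of more overhead.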

We don’t know which sites the researchers studied, but it seems like a safe bet that most, if not all, of the sites in these product categories have similar problems. It’s important to keep these attacks in perspective — bear in mind that they can only be carried out by someone who can eavesdrop on the network between you and the site you’re visiting.

It’s becoming increasingly clear that securing web-based applications is very difficult, and that the basic tools for developing web apps don’t do much to help. The industry, and researchers, will be struggling with web app security issues for years to come.

Domain Names Can't Defend Themselves

Today, the Kentucky Supreme Court handed down an opinion in the saga of Kentucky vs. 141 Domain Names (described a while back here on this blog). Here’s the opinion.

This case is fascinating. A quick recap: Kentucky attempted a property seizure of 141 domain names allegedly involved in gambling on the theory that the domain names themselves constituted “gambling devices” under Kentucky law and were therefore illegal. The state held a forfeiture hearing where anyone with an interest in the “property” could show up to defend their interest in the property; otherwise, the State would order the registrars to transfer “ownership” of the domain names to Kentucky. No individual claiming that they own one of the domain names showed up. Litigation began when two industry associations (iMEGA and IGC) claimed to represent unnamed persons who owned these domain names (and another lawyer showed up during litigation claiming representation of one specific domain name).

The subsequent litigation gets a bit complicated; suffice it to say that the issue that reached the KY Supreme Court was standing: could an association that claimed to represent an owner of a domain name affected by this action properly represent that owner in court without identifying the owner and showing that the owner did in fact own an affected domain name?

The Kentucky Supreme Court said no: there needs to be at least one identified individual owner who will suffer harm before the association can stand in that owner’s stead. The court ruled:

Due to the incapacity of domain names to contest their own seizure and the inability of iMEGA and IGC to litigate on behalf of anonymous registrants, the Court of Appeals is reversed and its writ is vacated.

And on the issue of whether a piece of property can represent itself:

“An Internet domain name does not have an interest in itself any more than a piece of land is interested in its own use.”

Anyway, it would seem that the options for next steps include 1) identifying at least one owner who would suffer harm and then moving back up to the Supreme Court (given that the merits had been argued at the Appeals level), or 2) deciding that the anonymity of domain name ownership in this case is more important than the fight over this very weird seizure of domain names.

As a non-lawyer, I wonder if it’s possible to represent an owner as a John Doe by submitting an affidavit of ownership of an affected domain name.

UPDATE (2010-03-19T00:07:07 EDT): Check the comments for why a John Doe strategy won’t work when the interest in anonymity is to avoid personal liability rather than free expression.

A weird bonus for people who have read this far: if I open the PDF of the opinion on my Mac in Preview.app or Skim.app (two PDF readers), the “SPORTSBOOK.COM” entry in the listing of the parties on the first page is hyperlinked. However, I don’t see this in Adobe Acrobat Pro or Reader. Seems like the KY Supreme Court is, likely inadvertently, linking to one of the 141 domain names. Of course, Preview.app and Skim.app might share a library that causes this one URL to be linked… I’m not a good enough PDF sleuth to figure it out.

Round 2 of the PACER Debate: What to Expect

The past year has seen an explosion of interest in free access to the law. Indeed, something of a movement appears to be coalescing around the issue, due in no small part to the growing Law.gov effort (see the latest list of events). One subset of this effort is our work on PACER, the online document access system for the federal courts. We contend that access to electronic court records should be free (see posts from me, Tim, and Harlan). Our RECAP project helps make some of these documents more accessible, and has gained adoption far above our expectations. That being said, RECAP doesn’t solve the fundamental problem: the federal government needs to publish the full public record for free online. Today, this argument came from an unlikely source, the FCC’s National Broadband Plan.

RECOMMENDATION 15.1: the primary legal documents of the federal government should be free and accessible to the public on digital platforms. […]

– For the Judicial branch, this should apply to all judicial opinions.

[…] Finally, all federal judicial decisions should be accessible for free and made publicly available to the people of the United States. Currently, the Public Access to Court Electronic Records system charges for access to federal appellate, district and bankruptcy court records.[7] As a result, U.S. federal courts pay private contractors approximately $150 million per year for electronic access to judicial documents.[8] [Steve note: The correct figure is $150m over 10 years. However it is quite possible that the federal government as a whole spends $150m or more per year for access to case materials.] While the E-Government Act has mandated that this system change so that this information is as freely available as possible, little progress has been made.[9] Congress should consider providing sufficient funds to publish all federal judicial opinions, orders and decisions online in an easily accessible, machine-readable format.

[7] See Public Access To Court Electronic Records—Overview, http://pacer.psc.uscourts.gov/pacerdesc.html (last visited Jan. 7, 2010).
[8] Carl Malamud, President and CEO, Public.Resource.Org, By the People, Address at the Gov 2.0 Summit, Washington, D.C. 25 (Sept. 10, 2009), available at http://resource.org/people/3waves_cover.pdf
[9] See Letter from Sen. Joseph I. Lieberman to Carl Malamud, President and CEO, Public.Resource.Org (Oct. 13, 2009), available at http://bulk.resource.org/courts.gov/foia/gov.senate.lieberman_20091013_from.pdf

This issue is outside of the Commission’s direct jurisdiction, but the Broadband Plan is intended as a blueprint for the federal government as a whole. In that context, the notion of ensuring that primary legal materials are available for free online fits perfectly with a broader effort to make government digitally accessible. In a similar vein, a bill was introduced today by Rep. Israel. The Public Online Information Act, backed by the Sunlight Foundation, creates a new federal advisory committee to advise all three branches of government on how to make government information available online for free.

To establish an advisory committee to issue nonbinding government-wide guidelines on making public information available on the Internet, to require publicly available Government information held by the executive branch to be made available on the Internet, to express the sense of Congress that publicly available information held by the legislative and judicial branches should be available on the Internet, and for other purposes.

These two developments are the first of what I expect to be many announcements in the coming months, coming from places like the transparency caucus. These announcements will share a theme — there is a growing mandate for universal free access to government information, and judicial information is a key component of that mandate. These requirements will increasingly go to the heart of full free access to the public record, and will reveal the discrepancies between different branches in this regard.

The FCC’s language doesn’t quite get everything right. Most notably, the language focuses on opinions even though there are other components of the record that are key to the public’s understanding of the law. Opinions on PACER are already theoretically free, but the kludgy system for accessing them doesn’t include all of the opinions, isn’t indexable by search engines, and gives only minimal information about the case that each opinion belongs to. Furthermore, the docket text required to understand the context and the search functionality required to find the opinions both require a fee. Subsequent calls for free access to case materials will have to be more holistic than the opinions-only language of the Broadband Plan.

The POIA language is also a step forward. A federal advisory committee is a good thing in the context of a branch that is more accustomed to the adversarial process than notice-and-comment. However, we will need much more concrete requirements before we will have achieved our goals.

In the context of these announcements, the Administrative Office of the Courts made its own announcement today. The Judicial Conference has voted in favor of two measures that make incremental improvements on the current pay-wall model of access to PACER.

  • Adjust the Electronic Public Access fee schedule so that users are not billed unless they accrue charges of more than $10 of PACER usage in a quarterly billing cycle, in effect quadrupling the amount of data available without charge. Currently, users are not billed until their accounts total at least $10 in a one-year period.
  • Approve a pilot in up to 12 courts to publish federal district and bankruptcy court opinions via the Government Printing Office’s Federal Digital System (FDsys) so members of the public can more easily search across opinions and across courts.

These are minor tweaks on a fundamentally limited system. Don’t get me wrong — a world with these changes is better than a world without. It is slightly easier to avoid spending more than $10 in a given quarter than in a given year, but it’s nevertheless likely that you will do so unless you know exactly what you are looking for and retrieve only a few documents. It’s also good to establish precedent for GPO publishing case materials, but that doesn’t require a limited trial that could end in bureaucratic quagmire. The GPO can handle publishing many documents, and any reasonably qualified software engineer could figure out how to deliver them in short order. What’s more, the courts could provide universal free public access today, with zero engineering work: offer a single PACER login that is never billed or, better yet, just stop billing all accounts.

The next round of the PACER debate will be over whether we make a fundamental change in access to federal court records or concede minor tweaks and call it a day.