August 18, 2018

Archives for August 2010

Assessing PACER's Access Barriers

The U.S. Courts recently conducted a year-long assessment of their Electronic Public Access program which included a survey of PACER users. While the results of the assessment haven’t been formally published, the Third Branch Newsletter has an interview with Bankruptcy Judge J. Rich Leonard that discusses a few high-level findings of the survey. Judge Leonard has been heavily involved in shaping the evolution of PACER since its inception twenty years ago and continues to lead today.

The survey covered a wide range of PACER users—“the courts, the media, litigants, attorneys, researchers, and bulk data collectors”—and Judge Leonard claims they found “a remarkably high level of satisfaction”: around 80% of those surveyed were “satisfied” or “very satisfied” with the service.

If we compare public access before we had PACER to where we are now, there is clearly much success to celebrate. But the key question is not only whether current users are satisfied with the service but also whether PACER is reaching its entire audience of potential users. Are there artificial obstacles preventing potential PACER users—who admittedly would be difficult to poll—from using the service? The satisfaction statistic may be fine at face value, assuming that a representative sample of users was polled, but it could be misleading if it’s being used to gauge the overall success of PACER as a public access system.

One indicator of obstacles may be another statistic cited by Judge Leonard: “about 45% of PACER users also use CM/ECF,” the Courts’ electronic case management and filing system. To put it another way, nearly half of all PACER users are currently attorneys who practice federal law.

That number seems inordinately high to me and suggests that significant barriers to public access may exist. In particular, account registration requires all users to submit a valid credit card for billing (or alternatively a valid home address to receive log-in credentials and billing statements by mail). Even if users’ credit cards are never charged, this registration hurdle may already turn away many potential PACER users at the door.

The other barrier is obviously the cost itself. With a few exceptions, users are forced to pay a fee for each document they download, at a metered rate of eight cents per page. Judge Leonard asserts that “surprisingly, cost ranked way down” in the survey and that “most people thought they paid a fair price for what they got.”

But this doesn’t necessarily imply that cost isn’t a major impediment to access. It may just be that those surveyed—primarily lawyers—simply pass the cost of using PACER down to their clients and never bear the cost themselves. For the rest of PACER users who don’t have that luxury, the high cost of access can completely rule out certain kinds of legal research, or cause users to significantly ration and monitor their usage (as is the case even in the vast majority of our nation’s law libraries), or wholly deter users from ever using the service.

Judge Leonard rightly recognizes that it’s Congress that has authorized the collection of user fees, rather than using general taxpayer money, to fund the electronic public access program. But I wish the Courts would at least acknowledge that moving away from a fee-based model, to a system funded by general appropriations, would strengthen our judicial process and get us closer to securing each citizen’s right to equal protection under the law.

Rather than downplaying the barriers to public access, the Courts should work with Congress to establish a way forward to support a public access system that is truly open. They should study and report on the extent to which Congress already funds PACER indirectly, through Executive and Legislative branch PACER fee payments to the Judiciary, and re-appropriate those funds directly. If there is a funding shortfall, and I assume there will be, they should study the various options for closing that gap, such as additional direct appropriations or a slight increase in certain filing fees.

With our other two branches of government making great strides in openness and transparency with the help of technology, the Courts similarly need to transition away from a one-size-fits-all approach to information dissemination. Public access to the courts will be fundamentally transformed by a vigorous culture of civic innovation around federal court documents, and this will only happen if the Courts confront today’s access barriers head-on and break them down.

(Thanks to Daniel Schuman for pointing me to the original article.)

Do Not Track: Not as Simple as it Sounds

Over the past few weeks, regulators have rekindled their interest in an online Do Not Track proposal in hopes of better protecting consumer privacy. FTC Chairman Jon Leibowitz told a Senate Commerce subcommittee last month that Do Not Track is “one promising area” for regulatory action and that the Commission plans to issue a report in the fall about “whether this is one viable way to proceed.” Senator Mark Pryor (D-AR), who sits on the subcommittee, is also reportedly drafting a new privacy bill that includes some version of this idea, of empowering consumers with blanket opt-out powers over online tracking.

Details are sparse at this point about how a Do Not Track mechanism might actually be implemented. There are a variety of possible technical and regulatory approaches to the problem, each with its own difficulties and limitations, which I’ll discuss in this post.

An Adaptation of “Do Not Call”

Because of its name, Do Not Track draws immediate comparisons to arguably the most popular piece of consumer protection regulation ever instituted in the US—the National Do Not Call Registry. If the FTC were to take an analogous approach for online tracking, a consumer would register his device’s network identifier—its IP address—with the national registry. Online advertisers would then be prohibited from tracking devices that are identified by those IP addresses.

Of course, consumer devices rarely have persistent long-term IP addresses. Most ISPs assign IP addresses dynamically (using DHCP) and a single device might be assigned a new IP address every few minutes. Consumer devices often also share the same IP address at the same time (using NAT) so there’s no stable one-to-one mapping between IPs and devices. Things could be different with IPv6, where each device could have its own stable IP address, but the Do Not Call framework, directly applied, is not the best solution for today’s online world.

The comparison is still useful though, if only to caution against the assumption that Do Not Track will be as easy, or as successful, as Do Not Call. The differences between the problems at hand and the technologies involved are substantial.

A Registry of Tracking Domains

Back in 2007, a coalition of online consumer privacy groups lobbied for the creation of a national Do Not Track List. They proposed a reverse approach: online advertisers would be required to register with the FTC all domain names used to issue persistent identifiers to user devices. The FTC would then publish this list, and it would be up to the browser to protect users from being tracked by these domains. Notice that the onus here is fully on the browser—equipped with this list—to protect the user from being uniquely identified. Meanwhile, online advertisers would still have free rein to try any method they wish to track user behavior, so long as it happens from these tracking domains.
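To make the browser's role under this proposal concrete, here is a minimal sketch in Python of how a client might consult such a list before sending cookies. Everything in it is an assumption for illustration: the proposal did not specify a list format, the domain names are placeholders, and the function names are hypothetical. It treats a host as a tracking domain if the host itself or any of its parent domains appears on the list.

```python
# Hypothetical registry contents; a real list would be published by the FTC.
TRACKING_REGISTRY = {"ads.example.com", "tracker.example.net"}


def is_tracking_domain(host, registry):
    """Return True if host, or any parent domain of host, is on the registry."""
    labels = host.lower().rstrip(".").split(".")
    # Check "cdn.ads.example.com", then "ads.example.com", then "example.com", ...
    for i in range(len(labels)):
        if ".".join(labels[i:]) in registry:
            return True
    return False


def outgoing_cookies(host, cookies, registry):
    """Withhold all stored cookies from registered tracking domains."""
    if is_tracking_domain(host, registry):
        return {}
    return dict(cookies)
```

Note that this only withholds one kind of persistent identifier. Under the proposal, a registered domain would remain free to try other identification techniques, and the browser would have to counter each of them.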

We’ve learned over the past couple of years that modern browsers, from a practical perspective, can be limited in their ability to protect the user from unique identification. The starkest example of this is the browser fingerprinting attack, which was popularized by the EFF earlier this year. In this attack, the tracking site runs a special script that gathers information about the browser’s configuration, which is unique enough to identify the browser instance in nearly every case. The attack takes advantage of the fact that much of the gathered information is used frequently for legitimate purposes—such as determining which plugins are available to the site—so a browser which blocks the release of this information would surely irritate the user. As these kinds of “side-channel” attacks grow in sophistication, major browser vendors might always be playing catch-up in the technical arms race, leaving most users vulnerable to some form of tracking by these domains.
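The core idea of fingerprinting can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the EFF's actual method: the attribute names and values are made up, and a real script would gather many more signals. The point is that a site can combine ordinary configuration details into a stable identifier without storing anything on the user's machine.

```python
import hashlib
import json


def fingerprint(config):
    """Hash a browser's reported configuration into a stable identifier."""
    # Serialize with sorted keys so the same configuration always
    # produces the same string, regardless of collection order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative attributes only; these values are hypothetical.
browser_a = {
    "user_agent": "ExampleBrowser/1.0",
    "plugins": ["Flash", "PDF Viewer"],
    "timezone": "UTC-5",
}
browser_b = dict(browser_a, plugins=["Flash"])

same_browser_again = fingerprint(dict(reversed(list(browser_a.items()))))
```

Here `fingerprint(browser_a)` and `same_browser_again` are equal, while `fingerprint(browser_b)` differs because its plugin list differs. Clearing cookies changes none of these inputs, which is why this kind of identification is hard for a browser to prevent.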

The x-notrack Header

If we believe that browsers, on their own, will be unable to fully protect users, then any effective Do Not Track proposal will need to place some restraints on server tracking behavior. Browsers could send a signal to the tracking server to indicate that the user does not want this particular interaction to be tracked. The signaling mechanism could be in the form of a standard pre-defined cookie field, or more likely, an HTTP header that marks the user’s tracking preference for each connection.

In the simplest case, the HTTP header—call it x-notrack—is a binary flag that can be turned on or off. The browser could enable x-notrack for every HTTP connection, or for connections to only third party sites, or for connections to some set of user-specified sites. Upon receiving the signal not to track, the site would be prevented, by FTC regulation, from setting any persistent identifiers on the user’s machine or using any other side-channel mechanism to uniquely identify the browser and track the interaction.
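As a rough sketch of what honoring the flag might look like on the server side, consider the following Python. The header name x-notrack comes from the discussion above, but the rest is assumed for illustration: I treat the value "1" as "on", the function names are hypothetical, and the cookie stands in for any persistent identifier.

```python
def tracking_allowed(request_headers):
    """Return False when the request carries an enabled x-notrack flag."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {
        name.lower(): value.strip() for name, value in request_headers.items()
    }
    # Assumption: "1" turns the flag on; any other value, or no header, is off.
    return normalized.get("x-notrack") != "1"


def build_response_headers(request_headers, visitor_id):
    """Set a persistent identifier only if the user has not opted out."""
    response = {"Content-Type": "text/html"}
    if tracking_allowed(request_headers):
        # One-year cookie standing in for any persistent identifier.
        response["Set-Cookie"] = f"visitor_id={visitor_id}; Max-Age=31536000"
    return response
```

A request sent with `X-NoTrack: 1` gets a response with no `Set-Cookie` header, while a request without the flag gets the identifier as usual. Skipping the cookie covers only part of the obligation described above; a compliant site would also have to refrain from fingerprinting and other side-channel identification, which no code on the browser's side can verify.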

While this approach seems simple, it could raise a few complicated issues. One issue is bifurcation: nothing would prevent sites from offering limited content or features to users who choose to opt out of tracking. One could imagine a divided Web, where a user who turns on the x-notrack header for all HTTP connections—i.e. a blanket opt-out—would essentially turn off many of the useful features on the Web.

By being more judicious in the use of x-notrack, a user could permit silos of first-party tracking in exchange for individual feature-rich sites, while limiting widespread tracking by third parties. But many third parties offer useful services, like embedding videos or integrating social media features, and they might require that users disable x-notrack in order to access their services. Users could theoretically make a privacy choice for each third party, but such a reality seems antithetical to the motivations behind Do Not Track: to give consumers an easy mechanism to opt out of harmful online tracking in one fell swoop.

The FTC could potentially remedy this scenario by including some provision for “tracking neutrality,” which would prohibit sites from unnecessarily discriminating against a user’s choice not to be tracked. I won’t get into the details here, but suffice it to say that crafting a narrow yet effective neutrality provision would be highly contentious.

Privacy Isn’t a Binary Choice

The underlying difficulty in designing a simple Do Not Track mechanism is the subjective nature of privacy. What one user considers harmful tracking might be completely reasonable to another. Privacy isn’t a single binary choice but rather a series of individually-considered decisions that each depend on who the tracking party is, how much information can be combined and what the user gets in return for being tracked. This makes the general concept of online Do Not Track—or any blanket opt-out regime—a fairly awkward fit. Users need simplicity, but whether simple controls can adequately capture the nuances of individual privacy preferences is an open question.

Another open question is whether browser vendors can eventually “win” the technical arms race against tracking technologies. If so, regulations might not be necessary, as innovative browsers could fully insulate users from unwanted tracking. While tracking technologies are currently winning this race, I wouldn’t call it a foregone conclusion.

The one thing we do know is this: Do Not Track is not as simple as it sounds. If regulators are serious about putting forth a proposal, and it sounds like they are, we need to start having a more robust conversation about the merits and ramifications of these issues.

New Search and Browsing Interface for the RECAP Archive

We have written in the past about RECAP, our project to help make federal court documents more easily accessible. We continue to upgrade the system, and we are eager for your feedback on a new set of functionality.

One of the most-requested RECAP features is a better web interface to the archive. Today we’re releasing an experimental system for searching and browsing, at archive.recapthelaw.org. There are also a couple of extra features that we’re eager to get feedback on. For example, you can subscribe to an RSS feed for any case in order to get updates when new documents are added to the archive. We’ve also included some basic tagging features that let anybody add tags to any case. We’re sure that there will be bugs to be fixed or improvements that can be made. Please let us know.

The first version of the system was built by an enterprising team of students in Professor Ed Felten’s “Civic Technologies” course: Jen King, Brett Lullo, Sajid Mehmood, and Daniel Mattos Roberts. Dhruv Kapadia has done many of the subsequent updates. The links from the Recap Archive pages point to files on our gracious host, the Internet Archive.

See, for example, the RECAP Archive page for United States of America v. Arizona, State of, et al. This is the Arizona District Court case in which the judge last week issued an order granting an injunction against several portions of the controversial immigration law. As you can see, some of the documents have a “Download” link that allows you to directly download the document from the Internet Archive, whereas others have a “Buy from PACER” link because no RECAP users have yet liberated the document.