April 19, 2024

Archives for August 2010

The Future of DRE Voting Machines

Last week at the EVT/WOTE workshop, Ari Feldman and I unveiled a new research project that we feel represents the future of DRE voting machines. DRE (direct-recording electronic) voting machines are ones where voters cast their ballots by pressing buttons or using a touch screen, and the primary record of the votes is stored in a computer memory. Numerous scientific studies have demonstrated that such machines can be reprogrammed to steal votes, so when we got our hands on a DRE called the Sequoia AVC Edge, we decided to do something different: we reprogrammed it to run Pac-Man.

As more states move away from using insecure DREs, there’s a risk that thousands of these machines will clog our landfills. Fortunately, our results show that they can be productively repurposed. We believe that in the not-so-distant future, recycled DREs will provide countless hours of entertainment in the basements of the nation’s nerds.

To see how we did it, visit our Pac-Man on the AVC Edge voting machine site.

Assessing PACER's Access Barriers

The U.S. Courts recently conducted a year-long assessment of their Electronic Public Access program which included a survey of PACER users. While the results of the assessment haven’t been formally published, the Third Branch Newsletter has an interview with Bankruptcy Judge J. Rich Leonard that discusses a few high-level findings of the survey. Judge Leonard has been heavily involved in shaping the evolution of PACER since its inception twenty years ago and continues to lead today.

The survey covered a wide range of PACER users—“the courts, the media, litigants, attorneys, researchers, and bulk data collectors”—and Judge Leonard claims they found “a remarkably high level of satisfaction”: around 80% of those surveyed were “satisfied” or “very satisfied” with the service.

If we compare public access before we had PACER to where we are now, there is clearly much success to celebrate. But the key question is not only whether current users are satisfied with the service but also whether PACER is reaching its entire audience of potential users. Are there artificial obstacles preventing potential PACER users—who admittedly would be difficult to poll—from using the service? The satisfaction statistic may be fine at face value, assuming that a representative sample of users were polled, but it could be misleading if it’s being used to gauge the overall success of PACER as a public access system.

One indicator of obstacles may be another statistic cited by Judge Leonard: “about 45% of PACER users also use CM/ECF,” the Courts’ electronic case management and filing system. To put it another way, nearly half of all PACER users are currently attorneys who practice federal law.

That number seems inordinately high to me and suggests that significant barriers to public access may exist. In particular, account registration requires all users to submit a valid credit card for billing (or alternatively a valid home address to receive log-in credentials and billing statements by mail.) Even if users’ credit cards are never charged, this registration hurdle may already turn away many potential PACER users at the door.

The other barrier is obviously the cost itself. With a few exceptions, users are forced to pay a fee for each document they download, at a metered rate of eight-cents per page. Judge Leonard asserts that “surprisingly, cost ranked way down” in the survey and that “most people thought they paid a fair price for what they got.”

But this doesn’t necessarily imply that cost isn’t a major impediment to access. It may just be that those surveyed—primarily lawyers—simply pass the cost of using PACER down to their clients and never bear the cost themselves. For the rest of PACER users who don’t have that luxury, the high cost of access can completely rule out certain kinds of legal research, or cause users to significantly ration and monitor their usage (as is the case even in the vast majority of our nation’s law libraries), or wholly deter users from ever using the service.

Judge Leonard rightly recognizes that it’s Congress that has authorized the collection of user fees, rather than using general taxpayer money, to fund the electronic public access program. But I wish the Courts would at least acknowledge that moving away from a fee-based model, to a system funded by general appropriations, would strengthen our judicial process and get us closer to securing each citizen’s right to equal protection under the law.

Rather than downplaying the barriers to public access, the Courts should work with Congress to establish a way forward to support a public access system that is truly open. They should study and report on the extent to which Congress already funds PACER indirectly, through Executive and Legislative branch PACER fee payments to the Judiciary, and re-appropriate those funds directly. If there is a funding shortfall, and I assume there will be, they should study the various options for closing that gap, such as additional direct appropriations or a slight increase in certain filing fees.

With our other two branches of government making great strides in openness and transparency with the help of technology, the Courts similarly needs to transition away from a one-size-fits-all approach to information dissemination. Public access to the courts will be fundamentally transformed by a vigorous culture of civic innovation around federal court documents, and this will only happen if the Courts confront today’s access barriers head-on and break them down.

(Thanks to Daniel Schuman for pointing me to the original article.)

Do Not Track: Not as Simple as it Sounds

Over the past few weeks, regulators have rekindled their interest in an online Do Not Track proposal in hopes of better protecting consumer privacy. FTC Chairman Jon Leibowitz told a Senate Commerce subcommittee last month that Do Not Track is “one promising area” for regulatory action and that the Commission plans to issue a report in the fall about “whether this is one viable way to proceed.” Senator Mark Pryor (D-AR), who sits on the subcommittee, is also reportedly drafting a new privacy bill that includes some version of this idea, of empowering consumers with blanket opt-out powers over online tracking.

Details are sparse at this point about how a Do Not Track mechanism might actually be implemented. There are a variety of possible technical and regulatory approaches to the problem, each with its own difficulties and limitations, which I’ll discuss in this post.

An Adaptation of “Do Not Call”

Because of its name, Do Not Track draws immediate comparisons to arguably the most popular piece of consumer protection regulation ever instituted in the US—the National Do Not Call Registry. If the FTC were to take an analogous approach for online tracking, a consumer would register his device’s network identifier—its IP address—with the national registry. Online advertisers would then be prohibited from tracking devices that are identified by those IP addresses.

Of course, consumer devices rarely have persistent long-term IP addresses. Most ISPs assign IP addresses dynamically (using DHCP) and a single device might be assigned a new IP address every few minutes. Consumer devices often also share the same IP address at the same time (using NAT) so there’s no stable one-to-one mapping between IPs and devices. Things could be different with IPv6, where each device could have its own stable IP address, but the Do Not Call framework, directly applied, is not the best solution for today’s online world.

The comparison is still useful though, if only to caution against the assumption that Do Not Track will be as easy, or as successful, as Do Not Call. The differences between the problems at hand and the technologies involved are substantial.

A Registry of Tracking Domains

Back in 2007, a coalition of online consumer privacy groups lobbied for the creation of a national Do Not Track List. They proposed a reverse approach: online advertisers would be required to register with the FTC all domain names used to issue persistent identifiers to user devices. The FTC would then publish this list, and it would be up to the browser to protect users from being tracked by these domains. Notice that the onus here is fully on the browser—equipped with this list—to protect the user from being uniquely identified. Meanwhile, online advertisers would still have free rein to try any method they wish to track user behavior, so long as it happens from these tracking domains.

We’ve learned over the past couple of years that modern browsers, from a practical perspective, can be limited in their ability to protect the user from unique identification. The most stark example of this is the browser fingerprinting attack, which was popularized by the EFF earlier this year. In this attack, the tracking site runs a special script that gathers information about the browser’s configurations, which are unique enough to identify the browser instance in nearly every case. The attack takes advantage of the fact that much of the gathered information is used frequently for legitimate purposes—such as determining which plugins are available to the site—so a browser which blocks the release of this information would surely irritate the user. As these kinds of “side-channel” attacks grow in sophistication, major browser vendors might always be playing catch-up in the technical arms race, leaving most users vulnerable to some form of tracking by these domains.

The x-notrack Header

If we believe that browsers, on their own, will be unable to fully protect users, then any effective Do No Track proposal will need to place some restraints on server tracking behavior. Browsers could send a signal to the tracking server to indicate that the user does not want this particular interaction to be tracked. The signaling mechanism could be in the form of a standard pre-defined cookie field, or more likely, an HTTP header that marks the user’s tracking preference for each connection.

In the simplest case, the HTTP header—call it x-notrack—is a binary flag that can be turned on or off. The browser could enable x-notrack for every HTTP connection, or for connections to only third party sites, or for connections to some set of user-specified sites. Upon receiving the signal not to track, the site would be prevented, by FTC regulation, from setting any persistent identifiers on the user’s machine or using any other side-channel mechanism to uniquely identify the browser and track the interaction.

While this approach seems simple, it could raise a few complicated issues. One issue is bifurcation: nothing would prevent sites from offering limited content or features to users who choose to opt-out of tracking. One could imagine a divided Web, where a user who turns on the x-notrack header for all HTTP connections—i.e. a blanket opt-out—would essentially turn off many of the useful features on the Web.

By being more judicious in the use of x-notrack, a user could permit silos of first-party tracking in exchange for individual feature-rich sites, while limiting widespread tracking by third parties. But many third parties offer useful services, like embedding videos or integrating social media features, and they might require that users disable x-notrack in order to access their services. Users could theoretically make a privacy choice for each third party, but such a reality seems antithetical to the motivations behind Do Not Track: to give consumers an easy mechanism to opt-out of harmful online tracking in one fell swoop.

The FTC could potentially remedy this scenario by including some provision for “tracking neutrality,” which would prohibit sites from unnecessarily discriminating against a user’s choice not to be tracked. I won’t get into the details here, but suffice it to say that crafting a narrow yet effective neutrality provision would be highly contentious.

Privacy Isn’t a Binary Choice

The underlying difficulty in designing a simple Do Not Track mechanism is the subjective nature of privacy. What one user considers harmful tracking might be completely reasonable to another. Privacy isn’t a single binary choice but rather a series of individually-considered decisions that each depend on who the tracking party is, how much information can be combined and what the user gets in return for being tracked. This makes the general concept of online Do Not Track—or any blanket opt-out regime—a fairly awkward fit. Users need simplicity, but whether simple controls can adequately capture the nuances of individual privacy preferences is an open question.

Another open question is whether browser vendors can eventually “win” the technical arms race against tracking technologies. If so, regulations might not be necessary, as innovative browsers could fully insulate users from unwanted tracking. While tracking technologies are currently winning this race, I wouldn’t call it a foregone conclusion.

The one thing we do know is this: Do Not Track is not as simple as it sounds. If regulators are serious about putting forth a proposal, and it sounds like they are, we need to start having a more robust conversation about the merits and ramifications of these issues.

New Search and Browsing Interface for the RECAP Archive

We have written in the past about RECAP, our project to help make federal court documents more easily accessible. We continue to upgrade the system, and we are eager for your feedback on a new set of functionality.

One of the most-requested RECAP features is a better web interface to the archive. Today we’re releasing an experimental system for searching and browsing, at archive.recapthelaw.org. There are also a couple of extra features that we’re eager to get feedback on. For example, you can subscribe to an RSS feed for any case in order to get updates when new documents are added to the archive. We’ve also included some basic tagging features that lets anybody add tags to any case. We’re sure that there will be bugs to be fixed or improvements that can be made. Please let us know.

The first version of the system was built by an enterprising team of students in Professor Ed Felten’s “Civic Technologies” course: Jen King, Brett Lullo, Sajid Mehmood, and Daniel Mattos Roberts. Dhruv Kapadia has done many of the subsequent updates. The links from the Recap Archive pages point to files on our gracious host, the Internet Archive.

See, for example, the RECAP Archive page for United States of America v. Arizona, State of, et al. This is the Arizona District Court case in which the judge last week issued an order granting injunction against several portions of the controversial immigration law. As you can see, some of the documents have a “Download” link that allows you to directly download the document from the Internet Archive, whereas others have a “Buy from PACER” link because no RECAP users have yet liberated the document.

A Major Internet Milestone: DNSSEC and SSL

On July 15th, a small but significant internet event occurred. On that day, years of planning culminated in the deployment of a cryptographic signature on the root DNS zone. To simplify greatly, this means that internet users will soon be able to have a much higher degree of trust in the hierarchical Domain Name System by utilizing the powers of recursion and cryptography. When a user’s computer is told that the IP address for “gmail.com” is 72.14.204.19, the user can be sure that this answer is true. This is important if you are someone such as a Chinese dissident who wants to reliably and securely reach gmail.com in order to communicate with your peers. The rollout of this throughout all domains, DNS resolvers, and client applications will take a little while, but the basic infrastructure is now in place.

This mitigates a certain class of vulnerabilities that web users used to face. Although it forecloses attacks at the domain name-to-IP address stage of requesting a web page, it does not necessarily foreclose attacks at other stages. For instance, an attacker that gets between you and the server you are trying to reach can simply claim that he is the server at 72.14.204.19. Our traditional way of protecting against this style of attack has been to rely on Certificate Authorities — trusted third-parties who certify digital key-pairs only for the true owners of a given domain name. Thus, even if an attacker tries to execute one of these “man-in-the-middle” attacks, he won’t possess the secret portion of the digital key-pair that is required to prove that his communications come from the true gmail.com. Your browser checks for a certified corresponding public key in the process of setting up a secure SSL/TLS connection to https://gmail.com.

Unfortunately, there are several technical, operational, and jurisdictional shortcomings of the Certificate Authority model. As I discussed in an earlier post, many of these problems are not present in the hierarchical and delegated model of DNS. However, DNS does not inherently provide the ability to store domain name-to-key-pair information. But could it? At one of the recent DNSSEC deployment ceremonies, Vint Cerf noted:

More has happened here today than meets the eye. An infrastructure has been created for a hierarchical security system, which can be purposed and re-purposed in a number of different ways. And so I would predict that although we started out putting this system together to assure that the domain name lookups return valid internet addresses, that in the long run this hierarchical structure of trust will be applied to a number of other functions that require strong authentication. And so you will have seen a new major milestone in the internet story.

I believe that storing SSL/TLS keys in DNSSEC-secured DNS records will be the first significant “other function” that will emerge. An alternative to Certificate Authorities for domain-to-key mapping is sorely needed. There are two major practical hurdles to getting there: 1) We must define a standard for placing keys in DNS and 2) We must secure the “last mile” from the service provder’s DNS resolver to the end-user’s computer.

The first hurdle involves the type of standard-setting that the internet community is quite familiar with. On a technical level, it means that we need to collectively decide what these DNS records look like. The second hurdle involves building more functionality into end users’ software so that it can do cryptographic validation of DNS results rather than blindly trusting its upstream DNS resolver. There may be temporary ways to do this within web browser code, but ultimately it will probably have to be built into what is called the “stub resolver” — a local service running on your computer that usually just asks for the results from the upstream resolver.

It is important to note that none of his makes Certificate Authorities obsolete. Although the DNS-based approach replaces the most basic type of SSL certificates, the Certificate Authorities will continue to be the only entities that can offer validation of real-world identity of site owners. The DNS-based approach and basic “domain validated” Certificate Authority certificates both verify only that whoever controls the domain name is the entity that your computer is communicating with, without saying who that is. In recent years, “Extended Validation” certificates (the ones that make your browser bar glow green) have begun to be offered by all major certificate authorities. These certificates require more rigorous validation of the identity of the owner, so that for example you know that the person who controls bankofamerica.com is really Bank of America Corporation.

At this year’s Black Hat and Defcon, Dan Kaminsky demonstrated some new software he is releasing that could make deploying DNSSEC more easy in general, and that could also address the two main hurdles to placing keys in DNS. He readily admits that his particular implementation is not perfect, and has encouraged critiques and changes. [Update: His slides are available here.]

Hopefully, with the input of the many smart folks in the security, internet standards, and software development communities, we will see a production-quality DNSSEC-secured solution to domain-to-key authentication in the near future.