October 18, 2018

Unlocking Hidden Consensus in Legislatures

A legislature is a small group with a big impact. Even for people who will never be part of one, the mechanics of a legislature matter — when they work well, we all benefit, and when they work poorly, we all lose out.

At the same time, with several hundred participants, legislatures are large enough to face some of the same collective action problems and risks that happen in larger groups.

As an example, consider the U.S. Congress. Many members might prefer to reduce agricultural subsidies (a view shared by most Americans, in polls) but few are eager to stick their necks out and attract the concentrated ire of the agricultural lobby. Most Americans care more about other things, and since large agricultural firms lobby and contribute heavily to protect the subsidies, individual members face a high political cost for acting on their own (and their constituents’) views on the issue. As another case in point, nearly all Congressional Republicans have signed Grover Norquist’s Taxpayer Protection Pledge, promising never to raise taxes at all (not even minimally, in exchange for massive spending cuts). But every plausible path to a balanced budget involves a combination of spending cuts and tax increases. Some Republicans may privately want to put a tax increase on the table, as they negotiate with Democrats to balance the budget — in which case there may be majority support for a civically fruitful compromise involving both tax increases and spending cuts. But any one Republican who defects from the pledge will face well-financed retaliation — because, and only because, with few or no defectors, Norquist can afford to respond to each one.

These are classic collective action problems. What if, as in other areas of life, we could build an online platform to help solve them?


A Possible Constitutional Caveat to SOPA

Tomorrow, a hearing in the House will consider H.R. 3261, the Stop Online Piracy Act (SOPA). There are many frustrating and concerning aspects of this bill. Perhaps most troubling, the current proposed text would undermine the existing safe harbor regime that gives online service providers clear, predictable, and reasonable obligations with respect to their users’ sometime infringing activities. Under SOPA’s new, ambiguous requirements, an online service provider might find itself in court merely for refusing to deploy a new technological surveillance system offered to it by rightholders — because such a refusal could be considered “deliberate action[] to avoid confirming” infringements by its users.

SOPA also incorporates DNS blocking provisions, substantively similar to the Senate’s PROTECT IP Act, that are designed to obstruct Americans’ access to “foreign infringing site[s].” It empowers the Attorney General to require a service provider to “take technically feasible and reasonable measures designed to prevent access by its subscribers located within the United States to the foreign infringing site.” This is a deeply troubling provision — and a stark departure from existing law — in that it would put U.S. Internet Service Providers into the business of obstructing, rather than providing, Americans’ access to the worldwide Internet, and would do so coarsely, since those “reasonable measures” could easily apply to whole sites rather than particular infringing pages or sections.

Intriguingly, a site is a “foreign infringing site” only if, among other criteria, it “would . . . be subject to seizure in the United States in an action brought by the Attorney General if such site were a domestic Internet site.” This is a clear reference to Operation In Our Sites, the ongoing program in which federal officials (acting under a controversial interpretation of current law) “seize” targeted sites by taking control of their Internet domain names, and cause those names to point to sternly worded warning banners instead of pointing to the targeted site. Because these seizures affect whole sites (rather than just the offending pages or files), and because they occur before the defendant receives notice or an opportunity for an adversary hearing, they raise serious First Amendment concerns. In other words, although many sites have been seized in fact, it is far from clear that any of the sites are validly “subject to seizure” under a correct interpretation of current law. As I wrote in a recent working paper:

When a domain name seizure under the current process is effective, it removes all the content at the targeted domain name, potentially including a significant amount of protected expressive material, from public view—without any court hearing. In other words, these are ex parte seizures of web sites that may contain, especially in the context of music or film blogs where people share opinions about media and offer their own remixes, significant amounts of First Amendment protected expression.

The Supreme Court has held in the obscenity context that “[w]hile a single copy of a book or film may be seized and retained for evidentiary purposes based on a finding of probable cause, the publication may not be taken out of circulation completely until there has been a determination of obscenity after an adversary hearing.” The hearing is needed because the evidentiary burden for restraining speech is greater than the burden for obtaining a warrant: “[M]ere probable cause to believe a legal violation has transpired is not adequate to remove books or films from circulation.” In that case, the speech at issue was alleged to be obscene, and hence unprotected by the First Amendment, but the Court held that the unlawfulness of the speech needed to be confirmed through an adversary hearing, before the speech could be suppressed.

ICE does make some ex parte effort, before a seizure, to verify that the websites it targets are engaged in criminal IPR infringement: As the agency’s director testified, “[f]or each domain name seized, ICE investigators independently obtained counterfeit trademarked goods or pirated copyrighted material that was in turn verified by the rights holders as counterfeit.”

Domain seizures might be distinguished from these earlier cases because “it is only the property interest in a domain name that is being seized here, not the content of the web site itself or the servers that the content resides on.” But the gravamen of the Supreme Court’s concern in past cases has been the amount of expression suppressed by a seizure, rather than the sheer quantity of items seized.

The operators of one targeted site are currently challenging an In Our Sites seizure, on First Amendment and other grounds, in an appeal before the Second Circuit. (An amicus brief submitted by EFF, CDT and Public Knowledge makes for excellent background reading.) The district court ruling below apparently turned on the defendant’s ability to demonstrate financial harm (rather than on the First Amendment issues), so it is possible that the appeals court may not reach the First Amendment issue. But it is also possible that the Second Circuit may take a dim view of the constitutionality of the In Our Sites seizures and, by extension, a dim view of the Attorney General’s constitutional power to “subject [domestic web sites] to seizure.” In that event, the scope of this portion of SOPA may turn out to be much narrower than its authors intend.

Identifying John Doe: It might be easier than you think

Imagine that you want to sue someone for what they wrote, anonymously, in a web-based online forum. To succeed, you’ll first have to figure out who they really are. How hard is that task? It’s a question that Harlan Yu, Ed Felten, and I have been kicking around for several months. We’ve come to some tentative answers that surprised us, and that may surprise you.

Until recently, I thought the picture was very grim for would-be plaintiffs, writing that it should be simple for “even a non-technical Internet user to engage in effectively untraceable speech online.” I still think it’s feasible for most users, if they make enough effort, to remain anonymous despite any level of scrutiny they are practically likely to face. But in recent months, as Harlan, Ed, and I have discussed this issue, we’ve started to see a flip side to the coin: In many situations, it may be far easier to unmask apparently anonymous online speakers than they, I, or many others in the policy community have appreciated. Today, I’ll tell a story that helps explain what I mean.

Anonymous online speech is a mixed bag: it includes some high-value speech such as political dissent in repressive regimes, some dreck we happily tolerate on First Amendment grounds, and some material that violates the laws of many jurisdictions, including child pornography and defamatory speech. For purposes of this discussion, let’s focus on cases like the recent AutoAdmit controversy, in which a plaintiff wishes to bring a defamation suit against an anonymous or pseudonymous poster to a web-based discussion forum. I’ll assume, as in the AutoAdmit suit, that the plaintiff has at least a facially plausible legal claim, so that if everyone’s identity were clear, it would also be clear that the plaintiff would have the legal option to bring a defamation suit. In the online context, these are usually what’s called “John Doe” suits, because the plaintiff’s lawyer does not know the name of the defendant in the suit, and must use “John Doe” as a stand-in name for the defendant. After filing a John Doe suit, the plaintiff’s lawyer can use subpoenas to force third parties to reveal information that might help identify the John Doe defendant.

In situations like these, if a plaintiff’s lawyer cannot otherwise determine who the poster is, the lawyer will typically subpoena the forum web site, seeking the IP address of the anonymous poster. Many widely used web-based discussion systems, including for example the popular WordPress blogging platform, routinely log the IP addresses of commenters. If the web site is able to provide an IP address for the source of the allegedly defamatory comment, the lawyer will do a reverse lookup, a WHOIS search, or both, on that IP address, hoping to discover that the IP address belongs to a residential ISP or another organization that maintains detailed information about its individual users. If the IP address does turn out to correspond to a residential ISP (rather than, say, to an open wifi hub at a coffee shop or library), then the lawyer will issue a second subpoena, asking the ISP to reveal the account details of the user who was using that IP address at the time it was used to transmit the potentially defamatory comment. This is known as a “subpoena chain” because it involves two subpoenas (one to the web site, and a second one, based on the results of the first, to the ISP).
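To make the lookup step concrete, here is a minimal Python sketch of the decision the lawyer faces once the web site hands over an IP address. Everything in it is illustrative: the `KNOWN_BLOCKS` table, its labels, and the function names are assumptions made for this example, and the address blocks are reserved documentation ranges rather than real ISP allocations. The `reverse_lookup` helper wraps the standard library's `socket.gethostbyaddr`; a real investigation would also consult WHOIS records, which this sketch does not attempt.

```python
import ipaddress
import socket
from typing import Optional

# Hypothetical table mapping address blocks to the kind of organization that
# holds them. These are reserved documentation ranges, not real allocations.
KNOWN_BLOCKS = {
    "192.0.2.0/24": "residential ISP",
    "198.51.100.0/24": "coffee shop wifi",
}


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the hostname registered for an IP address, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        return None


def classify(ip: str, blocks: dict = KNOWN_BLOCKS) -> str:
    """Label an address by the block it falls in, or 'unknown' if none matches."""
    address = ipaddress.ip_address(ip)
    for cidr, label in blocks.items():
        if address in ipaddress.ip_network(cidr):
            return label
    return "unknown"


def needs_second_subpoena(ip: str) -> bool:
    """A second subpoena only makes sense if the holder keeps per-user account records."""
    return classify(ip) == "residential ISP"
```

Under these assumptions, `needs_second_subpoena("192.0.2.17")` is true, while an address in the coffee shop block or in no known block ends the chain at the first subpoena.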

Of course, in many cases, this method won’t work. The forum web site may not have logged the commenter’s IP address. Or, even if an address is available, it might not be readily traceable back to an ISP account: the anonymous commenter may have been using an anonymization tool like Tor to hide his address. Or he may have been coming online from a coffee shop or similarly public place (which typically will not have logged information about its transient users). Or, even if he reached the web forum directly from his own ISP, that ISP might be located in a foreign jurisdiction, beyond the reach of an American lawyer’s usual legal tools.

Is this a dead end for the plaintiff’s lawyer, who wants to identify John Doe? Probably not. There is a range of other parties, not yet part of our story, who might have information that could help identify John Doe. When it comes to the AutoAdmit site, one of these parties is StatCounter.com, a web traffic measurement service that AutoAdmit uses to keep track of trends in its traffic over time.

At the moment I am writing this post, anyone can verify that AutoAdmit uses StatCounter by visiting AutoAdmit.com and choosing “View Source” from the web browser menu. The first screenful of web page code that comes up includes a block of text helpfully labeled “StatCounter Code,” which in turn runs a small piece of JavaScript that places a personalized StatCounter cookie on the machine of every user who visits AutoAdmit, or else (if one is already present) detects and records exactly which cookie it is. That’s how StatCounter can tell which visitors to AutoAdmit.com are new, which ones are returning, and which pages on the site are of greatest interest to new and returning users. StatCounter is in a position to track not only each user, but also each page, and each visit by a user to a certain page, over time. This includes not only the home page, but also the particular web page for each discussion “thread” on the site. Moreover, each post (even if anonymous) is marked with the time it was posted, down to the minute. So the plaintiff’s lawyer in our story could go to StatCounter, and ask only about visits to the particular thread where the relevant message was posted. If the post went up at 6:03 p.m. on a certain date, the lawyer could ask StatCounter, “What if anything do you know about the person who visited this web page at 6:03 p.m. on this date?” Of course, if John Doe’s browser is configured to refuse cookies, he wouldn’t be trackable. But most web-based discussion sites, including AutoAdmit, rely on cookies to let people log in to their pseudonymous accounts in order to post comments in the first place. In any case, the web is a much less convenient place without cookies, and as a practical matter most users do allow them.
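The lawyer's question to the analytics service amounts to a simple filter over visit records. The Python sketch below shows the idea under stated assumptions: the `Visit` record, its field names, the cookie IDs, the thread path, and the dates are all invented for illustration, and nothing here reflects how StatCounter actually stores its data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Visit:
    cookie_id: str   # hypothetical per-browser identifier carried by the cookie
    page: str        # path of the page that was loaded
    when: datetime   # time at which the analytics service recorded the visit


def visitors_in_minute(visits, page, posted_at):
    """Return the cookie IDs seen on `page` during the minute shown on the post."""
    minute = posted_at.replace(second=0, microsecond=0)
    return {
        v.cookie_id
        for v in visits
        if v.page == page and v.when.replace(second=0, microsecond=0) == minute
    }


# Invented sample log: only cookie-A loaded the thread during the 6:03 p.m. minute.
log = [
    Visit("cookie-A", "/thread/123", datetime(2008, 1, 15, 18, 3, 41)),
    Visit("cookie-B", "/thread/123", datetime(2008, 1, 15, 17, 20, 5)),
    Visit("cookie-C", "/thread/999", datetime(2008, 1, 15, 18, 3, 10)),
]
candidates = visitors_in_minute(log, "/thread/123", datetime(2008, 1, 15, 18, 3))
```

The sketch assumes the forum's clock and the analytics service's clock agree. In practice they might differ by some seconds, so a real query would probably need to look a little to either side of the posted minute.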

In fact, the lawyer may be able to do better still: The anonymous commenter will have accessed the page at least twice — once to view the discussion as it stood before he took part, and again after clicking the button to add his own post to the mix. If StatCounter recorded both visits, as it very likely would have, then it becomes even easier to tie the anonymous commenter to his StatCounter cookie (and to whatever browsing history StatCounter has associated with that cookie).
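Here is a rough Python sketch of that two-visit heuristic: keep only the cookies that loaded the thread during the minute the post went up and that also loaded it at least one other time shortly beforehand or in that same minute. The record layout (a tuple of cookie ID, page, and timestamp), the 30-minute look-back window, and the sample data are all assumptions chosen for illustration, not details from any real analytics service.

```python
from collections import Counter
from datetime import datetime, timedelta


def likely_posters(visits, page, posted_at, window=timedelta(minutes=30)):
    """Return cookie IDs with two or more recent loads of `page`, one in the post minute.

    `visits` is an iterable of (cookie_id, page, timestamp) tuples.
    The look-back `window` is an arbitrary assumption.
    """
    minute_start = posted_at.replace(second=0, microsecond=0)
    minute_end = minute_start + timedelta(minutes=1)
    total = Counter()
    in_post_minute = set()
    for cookie_id, visited_page, when in visits:
        if visited_page != page:
            continue
        if not (minute_start - window <= when < minute_end):
            continue
        total[cookie_id] += 1
        if when >= minute_start:
            in_post_minute.add(cookie_id)
    return {c for c in in_post_minute if total[c] >= 2}


# Invented sample data for a post that went up at 6:03 p.m.
sample = [
    ("cookie-A", "/thread/123", datetime(2008, 1, 15, 18, 1, 10)),   # read the thread
    ("cookie-A", "/thread/123", datetime(2008, 1, 15, 18, 3, 41)),   # reloaded on posting
    ("cookie-B", "/thread/123", datetime(2008, 1, 15, 18, 3, 5)),    # single visit only
    ("cookie-C", "/thread/123", datetime(2008, 1, 15, 17, 50, 0)),   # two visits, both early
    ("cookie-C", "/thread/123", datetime(2008, 1, 15, 17, 55, 0)),
]
suspects = likely_posters(sample, "/thread/123", datetime(2008, 1, 15, 18, 3))
```

With this sample, only `cookie-A` fits the pattern. A casual reader who happened to load the page during the same minute is filtered out, which is why the second recorded visit makes the match more convincing.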

There are a huge number of things to discuss here, and we’ll tackle several in the coming days. What would a web analytics provider like StatCounter know? Likely answers include IP addresses, times, and durations for the anonymous commenter’s previous visits to AutoAdmit. What about other, similar services, used by other sites? What about “beacons” that simply and silently collect data about users, and pay webmasters for the privilege? What about behavioral advertisers, whose business model involves tracking users across multiple sites and developing knowledge of their browsing habits and interests? What about content distribution networks? How would this picture change if John Doe were taking affirmative steps, such as using Tor, to obfuscate his identity?

These are some of the questions that we’ll try to address in future posts.