April 18, 2014

David Robinson

I'm a principal at Robinson & Yu LLC, a consulting firm that helps people harness Internet technologies to tackle civic problems. I'm also a Visiting Fellow of the Information Society Project at Yale Law School, where I research these issues in an academic context.

Just launched — Equal Future: Dispatches on Social Justice & Technology

Hello, Freedom to Tinker readers! I’m writing to introduce a new resource that may be of interest to you. It’s called Equal Future, and is written by Robinson + Yu with the support of the Ford Foundation.

Unlocking Hidden Consensus in Legislatures

A legislature is a small group with a big impact. Even for people who will never be part of one, the mechanics of a legislature matter — when they work well, we all benefit, and when they work poorly, we all lose out.

At the same time, with several hundred participants, legislatures are large enough to face some of the same collective action problems and risks that happen in larger groups.

As an example, consider the U.S. Congress. Many members might prefer to reduce agricultural subsidies (a view shared by most Americans, in polls) but few are eager to stick their necks out and attract the concentrated ire of the agricultural lobby. Most Americans care more about other things, and since large agricultural firms lobby and contribute heavily to protect the subsidies, individual members face a high political cost for acting on their own (and their constituents’) views on the issue.

As another case in point, nearly all Congressional Republicans have signed Grover Norquist’s Taxpayer Protection Pledge, promising never to raise taxes at all (not even minimally, in exchange for massive spending cuts). But every plausible path to a balanced budget involves a combination of spending cuts and tax increases. Some Republicans may privately want to put a tax increase on the table, as they negotiate with Democrats to balance the budget — in which case there may be majority support for a civically fruitful compromise involving both tax increases and spending cuts. But any one Republican who defects from the pledge will face well-financed retaliation — because, and only because, with few or no defectors, Norquist can afford to respond to each one.

These are classic collective action problems. What if, as in other areas of life, we could build an online platform to help solve them?

A Possible Constitutional Caveat to SOPA

Tomorrow, a hearing in the House will consider H.R. 3261, the Stop Online Piracy Act (SOPA). There are many frustrating and concerning aspects of this bill. Perhaps most troubling, the current proposed text would undermine the existing safe harbor regime that gives online service providers clear, predictable, and reasonable obligations with respect to their users’ sometimes-infringing activities. Under SOPA’s new, ambiguous requirements, an online service provider might find itself in court merely for refusing to deploy a new technological surveillance system offered to it by rightholders — because such a refusal could be considered “deliberate action[] to avoid confirming” infringements by its users.

SOPA also incorporates DNS blocking provisions, substantively similar to the Senate’s PROTECT IP Act, that are designed to obstruct Americans’ access to “foreign infringing site[s].” It empowers the Attorney General to require a service provider to “take technically feasible and reasonable measures designed to prevent access by its subscribers located within the United States to the foreign infringing site.” This is a deeply troubling provision — and a stark departure from existing law — in that it would put U.S. Internet Service Providers into the business of obstructing, rather than providing, Americans’ access to the worldwide Internet, and would do so coarsely, since those “reasonable measures” could easily apply to whole sites rather than particular infringing pages or sections.
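
To see concretely why such measures are coarse: a DNS-level block operates on a hostname, while allegedly infringing material typically sits at particular paths beneath that hostname. The short Python sketch below is purely illustrative and uses a made-up domain; it simply shows that the only part of a URL visible at the DNS layer is the hostname, so a block imposed there sweeps in every page on the site, infringing or not.

```python
from urllib.parse import urlparse
import socket

# Two hypothetical pages on the same (made-up) foreign site: one allegedly
# infringing, one ordinary discussion.
urls = [
    "http://example-foreign-site.test/downloads/infringing-file.html",
    "http://example-foreign-site.test/forum/ordinary-discussion.html",
]

for url in urls:
    host = urlparse(url).hostname
    # DNS resolution sees only the hostname, never the path, so a block
    # applied at this layer affects both pages identically.
    print(f"{url} -> DNS sees only: {host}")
    try:
        print("   resolves to:", socket.gethostbyname(host))
    except socket.gaierror:
        print("   (made-up domain; does not actually resolve)")
```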

Intriguingly, a site is a “foreign infringing site” only if — among other criteria — it “would . . . be subject to seizure in the United States in an action brought by the Attorney General if such site were a domestic Internet site.” This is a clear reference to Operation in Our Sites, the ongoing program in which federal officials (acting under a controversial interpretation of current law) “seize” targeted sites by taking control over their Internet domain names, and cause those names to point to sternly-worded warning banners instead of pointing to the targeted site. Because these seizures impact whole sites (rather than just the offending pages or files), and because they occur before the defendant receives notice or an opportunity for an adversary hearing, they raise serious First Amendment concerns. In other words, although many sites have been seized in fact, it is far from clear that any of the sites are validly “subject to seizure” under a correct interpretation of current law. As I wrote in a recent working paper,

When a domain name seizure under the current process is effective, it removes all the content at the targeted domain name, potentially including a significant amount of protected expressive material, from public view—without any court hearing. In other words, these are ex parte seizures of web sites that may contain, especially in the context of music or film blogs where people share opinions about media and offer their own remixes, significant amounts of First Amendment protected expression.

The Supreme Court has held in the obscenity context that “[w]hile a single copy of a book or film may be seized and retained for evidentiary purposes based on a finding of probable cause, the publication may not be taken out of circulation completely until there has been a determination of obscenity after an adversary hearing.” The hearing is needed because the evidentiary burden for restraining speech is greater than the burden for obtaining a warrant: “[M]ere probable cause to believe a legal violation has transpired is not adequate to remove books or films from circulation.” In that case, the speech at issue was alleged to be obscene, and hence unprotected by the First Amendment, but the Court held that the unlawfulness of the speech needed to be confirmed through an adversary hearing, before the speech could be suppressed.

ICE does make some ex parte effort, before a seizure, to verify that the websites it targets are engaged in criminal IPR infringement: As the agency’s director testified, “[f]or each domain name seized, ICE investigators independently obtained counterfeit trademarked goods or pirated copyrighted material that was in turn verified by the rights holders as counterfeit.”

Domain seizures might be distinguished from these earlier cases because “it is only the property interest in a domain name that is being seized here, not the content of the web site itself or the servers that the content resides on.” But the gravamen of the Supreme Court’s concern in past cases has been the amount of expression suppressed by a seizure, rather than the sheer quantity of items seized.

The operators of one targeted site are currently challenging an In Our Sites seizure, on First Amendment and other grounds, in an appeal before the Second Circuit. (An amicus brief submitted by EFF, CDT and Public Knowledge makes for excellent background reading.) The district court ruling below apparently turned on the defendant’s ability to demonstrate financial harm (rather than on the First Amendment issues), so it is possible that the court may not reach the First Amendment issue. But it is also possible that the Second Circuit may take a dim view of the constitutionality of the In Our Sites seizures — and by extension, a dim view of the Attorney General’s constitutional power to “subject [domestic web sites] to seizure.” In that event, the scope of this portion of SOPA may turn out to be much narrower than its authors intend.

Identifying John Doe: It might be easier than you think

Imagine that you want to sue someone for what they wrote, anonymously, in a web-based online forum. To succeed, you’ll first have to figure out who they really are. How hard is that task? It’s a question that Harlan Yu, Ed Felten, and I have been kicking around for several months. We’ve come to some tentative answers that surprised us, and that may surprise you.

Until recently, I thought the picture was very grim for would-be plaintiffs, writing that it should be simple for “even a non-technical Internet user to engage in effectively untraceable speech online.” I still think it’s feasible for most users, if they make enough effort, to remain anonymous despite any level of scrutiny they are practically likely to face. But in recent months, as Harlan, Ed, and I have discussed this issue, we’ve started to see a flip side to the coin: In many situations, it may be far easier to unmask apparently anonymous online speakers than they, I, or many others in the policy community have appreciated. Today, I’ll tell a story that helps explain what I mean.

Anonymous online speech is a mixed bag: it includes some high-value speech such as political dissent in repressive regimes, some dreck we happily tolerate on First Amendment grounds, and some material that violates the laws of many jurisdictions, including child pornography and defamatory speech. For purposes of this discussion, let’s focus on cases like the recent AutoAdmit controversy, in which a plaintiff wishes to bring a defamation suit against an anonymous or pseudonymous poster to a web-based discussion forum. I’ll assume, as in the AutoAdmit suit, that the plaintiff has at least a facially plausible legal claim, so that if everyone’s identity were clear, it would also be clear that the plaintiff would have the legal option to bring a defamation suit. In the online context, these are usually what are called “John Doe” suits, because the plaintiff’s lawyer does not know the name of the defendant in the suit, and must use “John Doe” as a stand-in name for the defendant. After filing a John Doe suit, the plaintiff’s lawyer can use subpoenas to force third parties to reveal information that might help identify the John Doe defendant.

In situations like these, if a plaintiff’s lawyer cannot otherwise determine who the poster is, the lawyer will typically subpoena the forum web site, seeking the IP address of the anonymous poster. Many widely used web-based discussion systems, including, for example, the popular WordPress blogging platform, routinely log the IP addresses of commenters. If the web site is able to provide an IP address for the source of the allegedly defamatory comment, the lawyer will do a reverse lookup, a WHOIS search, or both, on that IP address, hoping to discover that the IP address belongs to a residential ISP or another organization that maintains detailed information about its individual users. If the IP address does turn out to correspond to a residential ISP — rather than, say, to an open wifi hub at a coffee shop or library — then the lawyer will issue a second subpoena, asking the ISP to reveal the account details of the user who was using that IP address at the time it was used to transmit the potentially defamatory comment. This is known as a “subpoena chain” because it involves two subpoenas (one to the web site, and a second one, based on the results of the first, to the ISP).
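
To make that first step concrete, here is a rough sketch (not a statement of anyone’s actual procedure) of the kind of technical check involved: a reverse DNS lookup on the commenter’s IP address, plus a WHOIS query, to see what organization the address traces back to. It assumes the standard whois command-line tool is installed, and the address shown is a placeholder from a reserved documentation range.

```python
import socket
import subprocess

def identify_network(ip_address):
    """Sketch of the lookups described above: reverse DNS plus WHOIS,
    to see whether an address traces back to a residential ISP."""
    # Reverse DNS: residential ISP addresses often resolve to hostnames
    # that include the provider's own domain name.
    try:
        hostname, _, _ = socket.gethostbyaddr(ip_address)
        print("Reverse DNS:", hostname)
    except socket.herror:
        print("Reverse DNS: no PTR record for this address")

    # WHOIS: shows which organization the address block is allocated to.
    # Assumes the standard whois command-line tool is available.
    result = subprocess.run(["whois", ip_address],
                            capture_output=True, text=True, check=False)
    print(result.stdout[:800])  # the first lines usually name the network

# Placeholder address from a reserved documentation range (RFC 5737);
# a real inquiry would use the address produced by the forum's logs.
identify_network("203.0.113.7")
```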

Of course, in many cases, this method won’t work. The forum web site may not have logged the commenter’s IP address. Or, even if an address is available, it might not be readily traceable back to an ISP account: the anonymous commenter may have been using an anonymization tool like Tor to hide his address. Or he may have been coming online from a coffee shop or similarly public place (which typically will not have logged information about its transient users). Or, even if he reached the web forum directly from his own ISP, that ISP might be located in a foreign jurisdiction, beyond the reach of an American lawyer’s usual legal tools.

Is this a dead end for the plaintiff’s lawyer, who wants to identify John Doe? Probably not. There are a range of other parties, not yet part of our story, who might have information that could help identify John Doe. When it comes to the AutoAdmit site, one of these parties is StatCounter.com, a web traffic measurement service that AutoAdmit uses to keep track of trends in its traffic over time.

At the moment I am writing this post, anyone can verify that AutoAdmit uses StatCounter by visiting AutoAdmit.com and choosing “View Source” from the web browser menu. The first screenful of web page code that comes up includes a block of text helpfully labeled “StatCounter Code,” which in turn runs a small piece of JavaScript that places a personalized StatCounter cookie on the machine of every user who visits AutoAdmit, or else (if one is already present) detects and records exactly which cookie it is. That’s how StatCounter can tell which visitors to AutoAdmit.com are new, which ones are returning, and which pages on the site are of greatest interest to new and returning users.

StatCounter is in a position to track not only each user, but also each page, and each visit by a user to a certain page, over time. This includes not only the home page, but also the particular web page for each discussion “thread” on the site. Moreover, each post (even if anonymous) is marked with the time it was posted, down to the minute. So the plaintiff’s lawyer in our story could go to StatCounter, and ask only about visits to the particular thread where the relevant message was posted. If the post went up at 6:03 p.m. on a certain date, the lawyer could ask StatCounter, “What, if anything, do you know about the person who visited this web page at 6:03 p.m. on this date?”

Of course, if John Doe’s browser is configured to refuse cookies, he wouldn’t be trackable. But most web-based discussion sites, including AutoAdmit, rely on cookies to let people log in to their pseudonymous accounts in order to post comments in the first place. In any case, the web is a much less convenient place without cookies, and as a practical matter most users do allow them.
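
To make the lawyer’s hypothetical question concrete, here is a minimal sketch of the kind of query a traffic-measurement service could run against its own hit logs. The record format is invented for illustration; StatCounter’s actual internal schema isn’t public.

```python
from datetime import datetime

# Invented record format: (cookie_id, page_url, visit_time, ip_address).
# A real analytics service would hold something similar, in whatever
# schema it actually uses.
hit_log = [
    ("cookie-A", "/thread/12345", datetime(2009, 3, 2, 18, 1), "198.51.100.20"),
    ("cookie-B", "/thread/12345", datetime(2009, 3, 2, 18, 3), "203.0.113.7"),
    ("cookie-C", "/thread/99999", datetime(2009, 3, 2, 18, 3), "192.0.2.55"),
]

def visits_to_page(log, page, minute):
    """Every logged hit on `page` during the given minute."""
    return [hit for hit in log
            if hit[1] == page
            and hit[2].replace(second=0, microsecond=0) == minute]

# "What, if anything, do you know about whoever viewed this thread
#  at 6:03 p.m. on this date?"
for cookie, page, when, ip in visits_to_page(
        hit_log, "/thread/12345", datetime(2009, 3, 2, 18, 3)):
    print(cookie, ip, when)
```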

In fact, the lawyer may be able to do better still: The anonymous commenter will have accessed the page at least twice — once to view the discussion as it stood before he took part, and again after clicking the button to add his own post to the mix. If StatCounter recorded both visits, as it very likely would have, then it becomes even easier to tie the anonymous commenter to his StatCounter cookie (and to whatever browsing history StatCounter has associated with that cookie).
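
Continuing the illustrative sketch above, narrowing the field to cookies that appear in the log both shortly before and shortly after the post’s timestamp might look something like this (same invented log format, now extended so that one cookie visits twice):

```python
from datetime import datetime, timedelta

# Same invented format as before: (cookie_id, page_url, visit_time, ip).
hit_log = [
    ("cookie-A", "/thread/12345", datetime(2009, 3, 2, 17, 58), "198.51.100.20"),
    ("cookie-B", "/thread/12345", datetime(2009, 3, 2, 18, 1), "203.0.113.7"),
    ("cookie-B", "/thread/12345", datetime(2009, 3, 2, 18, 4), "203.0.113.7"),
    ("cookie-C", "/thread/99999", datetime(2009, 3, 2, 18, 3), "192.0.2.55"),
]

def cookies_in_window(log, page, start, end):
    """Cookie IDs seen on `page` between `start` and `end`."""
    return {hit[0] for hit in log if hit[1] == page and start <= hit[2] <= end}

def likely_poster_cookies(log, page, post_time, margin=timedelta(minutes=10)):
    """Cookies seen on the thread both shortly before and shortly after the
    post went up -- consistent with loading the page and then seeing it
    again after submitting the comment."""
    before = cookies_in_window(log, page, post_time - margin, post_time)
    after = cookies_in_window(log, page, post_time, post_time + margin)
    return before & after

# cookie-B appears both before and after the 6:03 p.m. post, so it is the
# only candidate left after this narrowing step.
print(likely_poster_cookies(hit_log, "/thread/12345", datetime(2009, 3, 2, 18, 3)))
```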

There are a huge number of things to discuss here, and we’ll tackle several in the coming days. What would a web analytics provider like StatCounter know? Likely answers include IP addresses, times, and durations for the anonymous commenter’s previous visits to AutoAdmit. What about other, similar services, used by other sites? What about “beacons” that simply and silently collect data about users, and pay webmasters for the privilege? What about behavioral advertisers, whose business model involves tracking users across multiple sites and developing knowledge of their browsing habits and interests? What about content distribution networks? How would this picture change if John Doe were taking affirmative steps, such as using Tor, to obfuscate his identity?

These are some of the questions that we’ll try to address in future posts.

The Markey Net Neutrality Bill: Least Restrictive Network Management?

It’s an exciting time in the net neutrality debate. FCC Chairman Julius Genachowski’s speech on Monday promised a new FCC proceeding that will aim to create a formal rule to replace the Commission’s existing policy statement.

Meanwhile, net neutrality advocates in Congress are pondering new legislation for two reasons: First, there is a debate about whether the FCC currently has enough authority to enforce a net neutrality rule. Second, whether or not the Commission has such authority today, some would rather see net neutrality rules etched into statute than leave them to the uncertainties of the rulemaking process under this and future Commissions.

One legislative proposal comes from Rep. Ed Markey and colleagues. Called the Internet Freedom Preservation Act of 2009, its current draft is available on the Free Press web site.

I favor the broad goals that motivate this bill — an Internet that remains friendly to innovation and broadly available. But I personally believe the current draft of this bill would be a mistake, because it embodies a very optimistic view of the FCC’s ability to wield regulatory authority and avoid regulatory capture, not only under the current administration but also over the long-run future. It puts a huge amount of statutory weight behind the vague-till-now idea of “reasonable network management” — something that the FCC’s policy statement (and many participants in the debate) have said ISPs should be permitted to do, but whose meaning remains unsettled. Indeed, Ed raised questions back in 2006 about just how hard it might be to decide what this phrase should mean.

The section of the Markey bill that would be labeled as section 12(d) in statute says that a network management practice

. . . is a reasonable practice only if it furthers a critically important interest, is narrowly tailored to further that interest, and is the means of furthering that interest that is the least restrictive, least discriminatory, and least constricting of consumer choice available.

This language — particularly the trio of “leasts” — puts the FCC in a position to intervene if, in the Commission’s judgment, any alternative course of action would have been better for consumers than the one an ISP actually took. Normally, to call something “reasonable” means that it is within the broad range of possibilities that might make sense to an imagined “reasonable person.” This bill’s definition of “reasonable” is very different, since on its terms there is no scope for discretion within reasonableness — the single best option is the only one deemed reasonable by the statute.

The bill’s language may sound familiar — it is a modified form of the judicial “strict scrutiny” standard the courts use to review government action when the state uses a suspect classification (such as race) or burdens a fundamental right (such as free speech in certain contexts). In those cases, the question is whether or not a “compelling governmental interest” justifies the policy under review. Here, however, it’s not totally clear whose interest, in what, must be compelling in order for a given network management practice to count as reasonable. We are discussing the actions of ISPs, which are generally public companies. Do their interests in profit maximization count as compelling? Shareholders certainly think so. What about their interests in R&D? Or does the statute mean to single out the public’s interest in the general goods outlined in section 12(a), such as “protect[ing] the open and interconnected nature of broadband networks”?

I fear the bill would spur a food fight among ISPs, each of whom could complain about what the others were doing. Such a battle would raise the probability that those ISPs with the most effective lobbying shops will prevail over those with the most attractive offerings for consumers, if and when the two diverge.

Why use the phrase “reasonable network management” to describe this exacting standard? I think the most likely answer is simply that many participants in the net neutrality debate use the phrase as a shorthand term for whatever should be allowed — so that “reasonable” turns out to mean “permitted.”

There is also an interesting secondary conversation to be had here about whether it’s smart to bar in statute, as the Markey bill would, “. . . any offering that . . . prioritizes traffic over that of other such providers,” which could be read to bar evenhanded offers of prioritized packet routing to any customer who wants to pay a premium, something many net neutrality advocates (including, e.g., Prof. Lessig) have said they think is fine.

My bottom line is that we ought to speak clearly. It might or might not make sense to let the FCC intervene whenever it finds ISPs’ network management to be less than perfect (I think it would not, but recognize the question is debatable). But whatever its merits, a standard like that — removing ISP discretion — deserves a name of its own. Perhaps “least restrictive network management”?

Cross-posted at the Yale ISP Blog.

Open Government Data: Starting to Judge the Results

Like many others who read this blog, I’ve spent some time over the last year trying to get more civic data online. I’ve argued that government’s failure to put machine-readable data online is the key roadblock that separates us from a world in which exciting, Web 2.0 style technologies enrich nearly every aspect of civic life. This is an empirical claim, and as more government data comes online, it is being tested.

Jay Nath is the “manager of innovation” for the City and County of San Francisco, working to put municipal data online and build a community of developers who can make the most of it. In a couple of recent blog posts, he has considered the empirical state of government data publishing efforts. Drawing on data from Washington, DC, where officials led by then-city CTO Vivek Kundra have put a huge catalog of government data online, he analyzed usage statistics and found an 80/20 pattern of public use of online government data — enormous interest in crime statistics and 311-style service requests, but relatively little about housing code enforcement and almost none about city workers’ use of purchasing credit cards. Here’s the chart he made:

Note that this chart measures downloads, not traffic to downstream sites that may be reusing the data.
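
For readers who want to look for the same 80/20 pattern in another catalog’s statistics, a minimal sketch of the calculation (with made-up download counts standing in for real usage data) might look like this:

```python
# Made-up download counts per dataset; real numbers would come from a
# catalog's own usage-statistics export.
downloads = {
    "Crime incidents": 48000,
    "311 service requests": 31000,
    "Building permits": 6200,
    "Housing code enforcement": 1900,
    "Purchase card transactions": 150,
}

total = sum(downloads.values())
cumulative = 0.0
print(f"{'dataset':30} {'share':>7} {'cumulative':>11}")
for name, count in sorted(downloads.items(), key=lambda kv: kv[1], reverse=True):
    share = count / total
    cumulative += share
    print(f"{name:30} {share:7.1%} {cumulative:11.1%}")
# A cumulative share that climbs past 80% within the first one or two
# datasets is the concentration pattern described above.
```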

This analysis was part of a broader effort in San Francisco to begin measuring the return on investments in open government data. One simple measure, as many have remarked before, is the IT expenditure that government avoids when third-party innovators make it unnecessary for government to provide certain services or make certain investments. But this misses what seems, intuitively, to be the lion’s share of the benefit: new value that didn’t exist before, created by extra functionality that third-party innovators deliver but that government would not have. Another approach is to measure government responsiveness before and after effectiveness data begin to be published. Unfortunately, such measures are unlikely to be controlled — if services get worse, for example, it may have more to do with budget cuts than with any victory, or failure, of citizen monitoring.

Open government data advocates and activists have allies on the inside in a growing number of governmental contexts, from city hall to the White House. But for these allies to be successful, they will need to be able to point to concrete results — sooner and more urgently in the current economic climate than they otherwise might have. This holds a clear lesson for the activists: small, tangible steps that turn published government data into cost savings, measurable service improvements, or other concrete goods will “punch above their weight”: not only are they valuable in their own right, but they help favorably disposed civil servants make the case internally for more transparency and disclosure. Beyond aiming for perfection and thinking about the long run, the volunteer community would benefit from seeking low-hanging fruit that will prove the concept of open government data and justify further investment.

The rise of the “nanostory”

In today’s Wall Street Journal, I offer a review of Bill Wasik’s excellent new book, And Then There’s This: How Stories Live and Die in Viral Culture. Cliff’s notes version: This is a great new take on the little cultural boomlets and cryptic fads that seem to swarm all over the Internet. The author draws on his personal experience, including his creation of the still-hilarious Right Wing New York Times. Here’s a taste from the book itself—Wasik describing his decision to create the first flash mob:

It was out of the question to create a project that might last, some new institution or some great work of art, for these would take time, exact cost, require risk, even as their odds of success hovered at nearly zero. Meanwhile, the odds of creating a short-lived sensation, of attracting incredible attention for a very brief period of time, were far more promising indeed… I wanted my new project to be what someone would call “The X of the Summer” before I even contemplated exactly what X might be.

Recovery Act Spending: Getting to the Bottom Line

Under most circumstances, government spending is slow and deliberate—a key fact that helps reduce the chances of waste and fraud. But the recently passed Recovery Act is a special case: spending the money quickly is understood to be essential to the success of the Act. We all know that shoppers in a hurry tend to get less value for their money. But, ironically, the overall macroeconomic impact of the stimulus (and hence the average stimulative effect per dollar spent) may be maximized by quick spending, even if the speed premium does increase the total amount of waste and abuse.

This situation creates a paradox for transparency and oversight efforts. On the one hand, the quicker pace of spending makes it all the more important to provide for public scrutiny, and to provide information in ways that will rapidly enable as many people as possible to take advantage of the stimulus opportunities available to them. On the other, the same rush that makes transparency important also reduces the time available for those within government to design and build an infrastructure for stimulus transparency.

One of the troubling tradeoffs that has been made thus far involves information about stimulus funds that flow from the federal government to states and then from states to localities. This pattern is rarer than you might think, since much of the Recovery Act spending flows directly from federal agencies to end recipients. But for funds that do follow a path from federal to state to local officials, recent guidance issued April 3 by the Office of Management and Budget (OMB) makes clear that the federal reporting infrastructure being created for Recovery.gov will not collect information about what the localities ultimately do with the funds.

OMB says that it does have the legal authority to require detailed reporting on “all levels of subawards,” reaching end recipients (Acme Concrete or whoever gets a contract or grant from the municipality at the end of the governmental chain). But in the context of its sprint to get at least some system into place as soon as possible (with the debut date for the Recovery.gov system already pushed back to October), OMB has left this deep-level reporting out of its immediate plans. The office says that it “plans to expand the reporting model in the future to also obtain this information, once the system capabilities and processes have been established.”

On Monday, ten congressmen sent a letter to OMB urging it to collect this detailed information “as early as possible.” One reason for OMB to formulate detailed operational plans in this area, as I argued in recent testimony before the House Committee on Oversight and Government Reform, is that clarity from the top will help states make competent choices about what, if anything, they should do to support or supplement the federal reporting. As the members of Congress write:

While it is positive that OMB goes on to reserve the right in the guidance to expand this reporting model in the future, it would seem exercising this right and requiring this level of reporting as early as possible would help entities prepare for the disclosures before projects begin and provide clarification for states as they begin investing in new infrastructure to track ARRA funds.

In the end, everyone agrees that this detailed information about subawards is important to have—OMB “plans to collect” it and the signatories to yesterday’s letter want collection to start “as soon as possible.” But how soon is that? We don’t really know. The details of hard choices facing OMB as it races to implement the Recovery.gov reporting system are themselves not public, and making them public might (or might not) itself slow down the development of the site. If no system were permitted to launch without fully detailed reporting of subawards, we might wait longer for the web site’s launch. How much longer? OMB might not itself be sure, since software development times are notoriously difficult to forecast, and OMB has never before been asked to build a system of this kind. OMB asserts that it’s moving as fast as it can to collect as much information as possible, and without slowing it down to ask for explanations, we can’t really check that assertion.

Transparency often reduces the degree to which citizens must trust public officials. But in this case, ironically, it seems most reasonable to operate on the optimistic but realistic assumption that the people working on Recovery Act transparency are doing their jobs well, and to hope for good results.

Stimulus transparency and the states

Yesterday, I testified at a field hearing of the U.S. House Committee on Oversight and Government Reform. The hearing was titled “The American Recovery and Reinvestment Act of 2009: The Role of State and Local Governments.”

My written testimony addressed plans to put stimulus data on the Internet, primarily at Recovery.gov. There have been promising signs, but important questions remain open, particularly about stimulus funds that are set to flow through the states. I was reacting primarily to the most recent round of stimulus-related guidance from the Office of Management and Budget (dated April 3).

Based on the probing questions about Recovery.gov that were asked by members from both parties, I’m optimistic that Congressional oversight will be a powerful force to encourage progress toward greater online transparency.

Possible Opportunity for Outstanding Law Graduates

We are constantly looking for scholars of digital technology and public life to join us at the Center for Information Technology Policy. We’ll be making several appointments soon, and look forward to announcing them. Meanwhile, I wanted to highlight a possible opportunity for graduating law students who have a strong scholarly interest in cyberlaw (reflected in student notes or other publications) and who find themselves in a position to pursue a research project over the coming months.

A growing number of law firms are pushing back the start dates for graduating law students whom they have hired as new associates. In some cases, the firms are offering stipends to pay for these new hires to do public interest or academic work in the months before their start dates.

If you happen to be in the overlap between these two groups—a cyber-inclined graduating law student, with support from your firm to do academic work in the coming months—then you should know that CITP may be a logical home for you.

This is part of our larger openness, in general, to externally supported research fellowships. Under the right circumstances, we can provide an intellectual home, complete with workspace and Princeton’s excellent scholarly infrastructure, for exceptional researchers who have a clear project in view and who have a continuing affiliation with their long-term employer (in this case, the law firm).

If you want to know more, feel free to contact me.