As David mentioned in his previous post, plaintiffs’ lawyers in online defamation suits will typically issue a sequence of two “John Doe” subpoenas to try to unmask the identity of anonymous online speakers. The first subpoena goes to the website or content provider where the allegedly defamatory remarks were posted, and the second subpoena is sent to the speaker’s ISP. Both entities—the content provider and the ISP—are natural targets for civil discovery. Their logs together will often contain enough information to trace the remarks back to the speaker’s real identity. But when this isn’t enough to identify the speaker, the discovery process traditionally fails.
Are plaintiffs in these cases out of luck? Not if their lawyers know where else to look.
There are numerous third-party web services that may hold just enough clues to reidentify the speaker, even without the help of the content provider or the ISP. The vast majority of websites today depend on third parties to deliver valuable services that would otherwise be too expensive or time-consuming to develop in-house. Services such as online advertising, content distribution and web analytics are almost always handled by specialized servers from third-party businesses. Because a single third party can embed its service into a wide variety of sites across the web, it can track users across all the sites where it maintains a presence.
Take for example the popular blog Boing Boing. Upon loading its main page while recording the HTTP session, I noticed that my browser was automatically directed to make requests to domains owned by no fewer than 17 distinct third-party entities: 10 services that engage in advertising or marketing, five that embed media or integrate social networking functionality, and two that provide web analytics. By visiting this single webpage, my digital footprints have been scattered to and collected by at least 17 other online entities that I made no deliberate attempt to contact. And each of these entities will likely have stored a cookie on my web browser, allowing it to identify me uniquely later when I browse to one of its other partner sites. I don’t mean to pick on Boing Boing specifically; taking advantage of third-party services is a nearly universal practice on the web today. But it’s exactly this pervasiveness that makes it possible, if not probable, that all of my digital footprints together could link much of my online activity back to my actual identity.
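For readers who want to repeat this kind of count, here is a minimal Python sketch of how the third-party sites in a recorded session could be tallied. It is not the tool I used. The request list is made up for illustration (the `example.com` and `example.org` hosts are placeholders), and the "last two labels of the hostname" rule is a simplifying assumption; a careful tool would consult the Public Suffix List instead.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Return a rough 'site' key: the last two labels of the hostname.

    Assumption: this heuristic is only for illustration. It mishandles
    suffixes such as co.uk, which the Public Suffix List would catch.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.lower().split(".")[-2:])


def third_party_sites(page_url, request_urls):
    """List the distinct sites contacted that differ from the page's own site."""
    first_party = site_of(page_url)
    contacted = {site_of(u) for u in request_urls}
    return sorted(contacted - {first_party, ""})


# Hypothetical recording of the requests made while loading one page.
recorded = [
    "https://boingboing.net/img/logo.png",
    "https://api.facebook.com/",
    "https://ads.example.com/banner.js",
    "https://cdn.ads.example.com/pixel.gif",
    "https://stats.example.org/collect",
]

print(third_party_sites("https://boingboing.net/", recorded))
```

Note that the two `example.com` hosts collapse into one entry, since the count that matters is distinct entities rather than distinct hostnames.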
To make this point concrete, let’s say I post a potentially defamatory remark about someone using a pseudonym in the comments section of a Boing Boing article. It happens that for each article, Boing Boing displays the number of times that the article has been shared on Facebook. In order to fetch the current number, Boing Boing directs my browser to api.facebook.com to make a real-time query to the Facebook API. Since I happen to be logged in to Facebook at the time of the request, my browser sends my unique Facebook cookie along with the query. That cookie includes information that explicitly identifies me, namely my e-mail address, which doubles as my Facebook username.
In order to integrate a bit of useful social networking functionality, Boing Boing enables Facebook, a third party in this situation, to learn which articles I visit on Boing Boing and the dates and times of my visits. The same is true for Tweetmeme, which can now positively link my Twitter account—which I’m also logged in to—with my Boing Boing visits. Even without an authenticated login, the 15 other third parties present on Boing Boing could track me using any number of different methods, including browser fingerprinting, to build detailed dossiers that slowly begin to piece together who I am.
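To illustrate the kind of linking described above, here is a rough Python sketch of what a third party could do with its own request log. Everything in it is hypothetical: the log layout (cookie ID, Referer header, timestamp), the cookie IDs, the article URL and the times are invented for the example, and real services will store and query this data differently.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical article URL, used only for this example.
ARTICLE = "https://boingboing.net/example-article"

# Hypothetical log kept by one third party: (cookie ID, Referer, timestamp).
LOG = [
    ("cookie-A", ARTICLE, datetime(2010, 1, 5, 14, 2)),
    ("cookie-B", ARTICLE, datetime(2010, 1, 5, 9, 30)),
    ("cookie-A", "https://news.example.com/story", datetime(2010, 1, 5, 14, 10)),
    ("cookie-C", ARTICLE, datetime(2010, 1, 5, 14, 4)),
    ("cookie-C", "https://boingboing.net/another-article", datetime(2010, 1, 5, 14, 20)),
]


def visits_by_cookie(log):
    """Group page visits under the cookie that accompanied each request."""
    visits = defaultdict(list)
    for cookie_id, referer, when in log:
        visits[cookie_id].append((when, referer))
    for entries in visits.values():
        entries.sort()  # chronological browsing history per cookie
    return dict(visits)


def candidates(log, article_url, start, end):
    """Return cookie IDs seen on the given article within the time window."""
    return sorted(
        {cid for cid, referer, when in log
         if referer == article_url and start <= when <= end}
    )


# Suppose the comment was posted at 14:05; look at the five minutes before it.
window = (datetime(2010, 1, 5, 14, 0), datetime(2010, 1, 5, 14, 5))
print(candidates(LOG, ARTICLE, *window))
```

With this made-up data, the window narrows three cookies down to two. If either cookie belongs to a logged-in account, the third party can attach a name to it.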
From the perspective of a plaintiff’s lawyer, even if Boing Boing is unwilling or unable to produce any useful information, these third parties might be able to uniquely identify me as the likely defamer, or at least narrow the list of possible speakers down to a handful of users. But tracing speech is not always this easy. Tomorrow, I’ll discuss more complicated discovery strategies and the extent to which they are technically feasible.