April 23, 2014


The Latest in Nationwide Internet User Identification – Part 2 (the All-New, So-Called Federal Co-Conspirator Theory)

Since Part 1 in this series a few months ago, Plaintiffs have continued to file “pure bill of discovery” suits in Florida state court. These proceedings typically involve “John Does” who are accused of copyright infringement via peer-to-peer networks. The Plaintiffs (copyright-holders or their delegates) have continued to name as defendants in those “pure discovery” proceedings not the entities from whom they seek discovery (i.e., the Internet service providers) but instead John Does, from whom no discovery is sought. After filing their suits, Plaintiffs promptly seek and obtain an ex parte order for expedited discovery of the John Does’ names from the ISPs, even though the ISPs are not then represented or present in the proceeding. Because the ISPs are not technically parties, the Plaintiffs can use these orders to issue subpoenas to ISPs from across the country regardless of whether the ISPs or their subscribers would be subject to the jurisdiction of a Florida state court.

The Plaintiffs’ lawyers certainly must know that this is not right. For one thing, they tend to withdraw their subpoenas whenever it appears a court is actually going to hear the reasons why their use of the proceeding is improper.

Recently, several ISPs stood firm and proceeded to a hearing on their motions for protective order in a couple of these proceedings. The Plaintiffs’ lawyers, in typical fashion, tried to withdraw their subpoenas and argued that the judges should not listen to the ISPs’ arguments. Not surprisingly, the Plaintiffs did not fare well in an adversarial proceeding.



The Latest in Nationwide Internet User Identification – Part 1 (The Ancient State Law "Pure Bill of Discovery")

Plaintiffs are engaging in aggressive and questionable new tactics in a growing wave of federal copyright “John Doe” lawsuits. In those lawsuits, the obvious objective of the plaintiffs is to discover from Internet Service Providers (ISPs) the personal identities of many of the ISPs’ subscribers. The plaintiffs typically present the ISPs with long lists of subscriber IP addresses that have allegedly been used in copyright infringement. Many of these plaintiffs have built a business model around such suits and are often referred to as “copyright trolls.” The orders permitting “John Doe” discovery necessarily precede the naming of the defendants, and many if not most defendants are likely to settle rather than bear the expense of a defense (not to mention, in many cases, the embarrassment of association with pornographic works). Thus, at least for those defendants, the lawsuits effectively begin and end when their names and contact information are provided to the plaintiffs. Many of the copyright plaintiff attorneys would have it no other way – operating form-based lawsuit “factories,” harvesting settlements, and getting out without presenting any evidence at trial.

The response of the federal judges has been mixed. Many of them just grant the requested relief. In the interest of protecting privacy rights, a few judges have properly appointed attorneys ad litem to represent the unidentified Does. Some have decided that the joinder of numerous defendants in a single lawsuit is improper, and dismissed all the Does except for a single John or Jane. Others have required that the plaintiffs demonstrate a good faith belief that the subscriber-defendants reside in the forum and/or are otherwise subject to the personal jurisdiction of the court.

More recently, the copyright plaintiffs are turning to the state courts – an odd tactic given that copyright infringement claims may only be asserted in federal court. Remember, though, that these plaintiffs appear to be far more interested in the personally identifiable information of Internet subscribers (and coercing settlements) than in the actual pursuit of litigation. As such, they are simply motivated to seek, in the fewest possible lawsuits, as many Internet subscriber identifications for as many IP address/date/time stamps from as many ISPs as possible.

Consistent with such an objective, the plaintiffs’ lawyers have dusted off an ancient proceeding known as a “pure bill of discovery” – an equitable action that originated in the 19th century, before discovery was even available in legal proceedings under common law. As it turns out, this action is still available under a narrow set of circumstances in some states, including Florida, primarily where discovery is not otherwise obtainable and there is no adequate remedy at law.

Plaintiffs use this action to seek discovery in state court – presumably to avoid some of the same hurdles encountered in federal court. In Florida (the preferred jurisdiction so far), they contend that they should be permitted to file a “pure bill of discovery” for any alleged infringement, so long as they can somehow connect the alleged infringement to that jurisdiction (for example, because another alleged member of the same BitTorrent “swarm” – who could even be the plaintiff’s forensic investigator – was allegedly located in Florida).

But these plaintiffs aren’t using the “pure bill of discovery” the way it is supposed to work.

Because the “pure bill of discovery” is for the sole purpose of obtaining discovery, the “defendants” in such an action should be the persons from whom the information is sought. Here, that would be the ISPs. However, suing dozens and dozens of ISPs located across the country in a Florida state court could be inconvenient and costly to the plaintiffs given that the ISPs would need to be served with process and a significant number of the ISPs would likely resist. In addition, if there were actual adversaries (i.e., ISP defendants), the plaintiffs would have to demonstrate their rights and convince the court in an adversarial hearing that they are entitled to relief before an order or any subpoenas could be issued.

Preferring otherwise, the plaintiffs are suing the (unrepresented, unnamed, and defenseless) Doe defendants in their “pure bill of discovery” actions. That doesn’t make sense, you may say, because the plaintiffs are not seeking any discovery from the Does. True – in a “pure bill of discovery” action, the plaintiff has to be seeking discovery from the defendants in that action. To address this detail, the plaintiffs’ lawyers fictionally assert that they are seeking to require the Does to “confirm” that the identifying information to be provided by the ISPs is “accurate.” And, naturally, before the Doe defendants can “confirm” that they are who they are said to be, the plaintiffs need to uncover their names. So, after filing the lawsuit in a state court, the plaintiffs file an ex parte motion for discovery seeking to issue discovery requests to a long list of ISPs located across the nation (many beyond the state court’s jurisdiction), to obtain the personally identifiable information of hundreds of individual subscribers (i.e., the John Does). These ex parte motions actually get granted tout de suite.

Although the ISPs (much less the John Does) don’t have any opportunity to be heard beforehand, the ISPs can oppose the discovery requests once those requests are served on them. As a practical matter, though, most of the ISPs don’t; and those that do may simply be met with a voluntary dismissal by the plaintiff (as to those Does only), who would presumably rather not have the court actually hear the arguments made. Thus, the plaintiffs for the most part can readily obtain the necessary personally identifiable information to threaten to sue the alleged infringers (in federal court) and, in all likelihood, obtain quick settlement.

To the extent these plaintiffs get away with it, they have found a way to obtain a court order without opposition that permits nationwide identification of mass defendants in a single lawsuit. Assuming the Doe defendants settle, and anecdotal evidence suggests that many do, bothersome details such as service of process, personal jurisdiction, venue, joinder, and even advocacy in a court of law can be avoided entirely.

And why stop with seeking federal copyright claims? If these proceedings can actually be used in the way the plaintiffs are using them, there’s no reason why anyone couldn’t sue in Florida state court in order to get identifying subscriber information for subscribers located anywhere, from any ISP or other communications provider, under any legal theory. It seems to be the perfect tool of stealth and expedience, unless you happen to believe in the protection of fundamental individual rights and that the role of our judicial system is to resolve cases or controversies. It is hard to imagine that this antediluvian equitable action was intended to serve as a settlement weapon in abusive mass copyright litigation.


Erroneous DMCA notices and copyright enforcement, part deux

A few weeks ago, I wrote about a deluge of DMCA notices and pre-settlement letters that CoralCDN experienced in late August. This article actually received a bit of press, including MediaPost, ArsTechnica, TechDirt, and, very recently, Slashdot. I’m glad that my own experience was able to shed some light on the more insidious practices that are still going on under the umbrella of copyright enforcement. More transparency is especially important at this time, given the current debate over the Anti-Counterfeiting Trade Agreement.

Given this discussion, I wanted to write a short follow-on to my previous post.

The VPA drops Nexicon

First and foremost, I was contacted by the founder of the Video Protection Alliance not long after this story broke. I was informed that the VPA has not actually developed its own technology to discover users who are actively uploading or downloading copyrighted material, but rather contracts out this role to Nexicon. (You can find a comment from Nexicon’s CTO to my previous article here.) As I was told, the VPA was contracted by certain content publishers to help reduce copyright infringement of (largely adult) content. The VPA in turn contracted Nexicon to find IP addresses that are participating in BitTorrent swarms of those specified movies. Using the IP addresses given them by Nexicon, the VPA subsequently would send pre-settlement letters to the network providers of those addresses.

The VPA’s founder also assured me that their main goal was to reduce infringement, as opposed to collecting pre-settlement money. (And that users had been let off with only a warning, or, in the cases where infringement might have been due to an open wireless network, informed how to secure their wireless network.) He also expressed surprise that there were false positives in the addresses given to them (beyond said open wireless), especially to the extent that appropriate verification was lacking. Given this new knowledge, he stated that the VPA dropped their use of Nexicon’s technology.

BitTorrent and Proxies

Second, I should clarify my claims about BitTorrent’s usefulness with an on-path proxy. While it is true that the address registered with the BitTorrent tracker is not usable, peers connecting from behind a proxy can still download content from other addresses learned from the tracker. If their requests to those addresses are optimistically unchoked, they have the opportunity to even engage in incentivized bilateral exchange. Furthermore, the use of DHT- and gossip-based discovery with other peers—the latter is termed PEX, for Peer EXchange, in BitTorrent—allows their real address to be learned by others. Thus, through these more modern discovery means, other peers may initiate connections to them, further increasing the opportunity for tit-for-tat exchanges.

Some readers also pointed out that there is good reason why BitTorrent trackers do not just accept any IP address communicated to them via an HTTP query string, but rather use the end-point IP address of the TCP connection. Namely, any HTTP query parameter can be spoofed, leading to anybody being able to add another’s IP address to the tracker list. That would make the victims susceptible to receiving DMCA complaints, just as we experienced with CoralCDN. From a more technical perspective, their machines would also start receiving unsolicited TCP connection requests from other BitTorrent peers, an easy DoS amplification attack.

That said, there are some additional checks that BitTorrent trackers could do. For example, if the IP query string or X-Forwarded-For HTTP headers are present, only add the network IP address if it matches the query string or X-Forwarded-For headers. Additionally, some BitTorrent tracker operators have mentioned that they have certain IP addresses whitelisted as trusted proxies; in those cases, the X-Forwarded-For address is used already. Otherwise, I don’t see a good reason (plausible deniability aside) for recording an IP address that is known to be likely incorrect.
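The checks described above can be sketched roughly as follows. This is a minimal illustration, not code from any real tracker: the function name `address_to_record`, its parameters, and the example addresses are all hypothetical, and requiring every supplied hint to agree with the TCP endpoint is one (stricter) reading of the rule.

```python
from typing import FrozenSet, Optional


def address_to_record(remote_ip: str,
                      ip_param: Optional[str] = None,
                      x_forwarded_for: Optional[str] = None,
                      trusted_proxies: FrozenSet[str] = frozenset()) -> Optional[str]:
    """Return the IP a tracker should list for a peer, or None to skip listing it.

    Hypothetical policy sketch; the names and rules here are assumptions.
    """
    forwarded = None
    if x_forwarded_for:
        # X-Forwarded-For may carry a comma-separated chain; the first
        # entry is conventionally the original client.
        forwarded = x_forwarded_for.split(",")[0].strip()

    # Whitelisted proxy: trust the forwarded client address.
    if remote_ip in trusted_proxies and forwarded:
        return forwarded

    claims = [c for c in (ip_param.strip() if ip_param else None, forwarded) if c]
    if not claims:
        # No hint that a proxy is involved: the TCP endpoint is the best guess.
        return remote_ip

    # A hint is present: list the peer only if it agrees with the TCP endpoint.
    if all(claim == remote_ip for claim in claims):
        return remote_ip
    return None  # likely behind an unknown proxy; do not record a wrong address
```

With this policy, a request relayed by an unknown proxy that sets X-Forwarded-For is simply not listed, rather than being listed under the proxy's address.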

Best Practices for Online Technical Copyright Enforcement

Finally, my article pointed out a strategy that I clearly thought was insufficient for copyright enforcement: simply crawling a BitTorrent tracker for a list of registered IP addresses, and issuing an infringement notice to each IP address. I’ll add to that two other approaches that I think are insufficient, unethical, or illegal (or all three), yet have been bandied about as possible solutions.

  • Wiretapping: It has been suggested that network providers can perform deep-packet inspection (DPI) on their customers’ traffic in order to detect copyrighted content. This approach probably breaks a number of laws (either in the U.S. or elsewhere), creates a dangerous precedent and existing infrastructure for far-flung Internet surveillance, and yet is of dubious benefit given the move to encrypted communication by file-sharing software.
  • Spyware: By surreptitiously installing spyware/malware on end-hosts, one could scan a user’s local disk in order to detect the existence of potentially copyrighted material. This practice has even worse legal and ethical implications than network-level wiretapping, and yet politicians such as Senator Orrin Hatch (Utah) have gone as far as declaring that infringers’ computers should be destroyed. And it opens users up to the real danger that their computers or information could be misused by others; witness, for example, the security weaknesses of China’s Green Dam software.

So, if one starts from the position that copyrights are valid and should be enforceable—some dispute this—what would you like to see as best practices for copyright enforcement?

The approach taken by DRM is to try to build a technical framework that restricts users’ ability to share content or to consume it in a proscribed manner. DRM has been largely disliked by end-users, mostly because it creates a poor user experience and interferes with expected rights (under fair-use doctrine). In any case, DRM is a misleading argument here, as copyright infringement notices are needed precisely after “unprotected” content has already flown the coop.

So I’ll start with two practices that I would want all enforcement agencies to adopt when issuing DMCA take-down notices. Let’s restrict this consideration to complaints about “whole” content (e.g., entire movies), as opposed to those DMCA challenges over sampled or remixed content, which is a legal debate.

  • For any end client suspected of file-sharing, one MUST verify that the client was actually uploading or downloading content, AND that the content corresponded to a valid portion of a copyrighted file. In BitTorrent, this might be that the client sends or receives a complete file block, and that the file block hashes to the correct value specified in the .torrent file.
  • When issuing a DMCA take-down notice, the request MUST be accompanied by logged information that shows (a) the client’s IP:port network address engaged in content transfer (e.g., a record of a TCP flow); (b) the actual application request/response that was acted upon (e.g., BitTorrent-level logs); and (c) that the transferred content corresponds to a valid file block (e.g., a BitTorrent hash).
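As a rough sketch of these two requirements: in the original BitTorrent metadata format, the `pieces` field of a .torrent file is a concatenation of 20-byte SHA-1 digests, one per file block. The helper below checks an observed block against that field and only then produces a log record. The `Evidence` record, the function names, and the example values are hypothetical, chosen for illustration only.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

PIECE_HASH_LEN = 20  # size of one SHA-1 digest in the .torrent "pieces" field


def piece_matches(piece_hashes: bytes, index: int, data: bytes) -> bool:
    """True if `data` hashes to the SHA-1 listed for block `index`."""
    if index < 0:
        return False
    start = index * PIECE_HASH_LEN
    expected = piece_hashes[start:start + PIECE_HASH_LEN]
    if len(expected) != PIECE_HASH_LEN:
        return False  # index is past the end of the hash list
    return hashlib.sha1(data).digest() == expected


@dataclass(frozen=True)
class Evidence:
    """Hypothetical log record covering items (a) through (c) above."""
    peer_ip: str       # (a) network address of the observed TCP flow
    peer_port: int
    direction: str     # (b) what the peer did: "upload" or "download"
    piece_index: int
    piece_sha1: str    # (c) hash of the transferred block, hex-encoded
    observed_at: str   # timestamp supplied by the observer


def build_evidence(peer_ip: str, peer_port: int, direction: str,
                   piece_hashes: bytes, piece_index: int, data: bytes,
                   observed_at: str) -> Optional[Evidence]:
    """Return an Evidence record only if the block verifies; otherwise None."""
    if not piece_matches(piece_hashes, piece_index, data):
        return None
    return Evidence(peer_ip, peer_port, direction, piece_index,
                    hashlib.sha1(data).hexdigest(), observed_at)
```

The point of returning `None` on a failed hash check is that no notice could be generated without a verified block behind it.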

So my question to the readers: What would you add to or remove from this list? With what other approaches do you think copyright enforcement should be performed or incentivized?


Inaccurate Copyright Enforcement: Questionable "best" practices and BitTorrent specification flaws

[Today we welcome my Princeton Computer Science colleague Mike Freedman. Mike's research areas include computer systems, network software, and security. He writes a technical blog about these topics at Princeton S* Network Systems -- required reading for serious systems geeks like me. -- Ed Felten]

In the past few weeks, Ed has been writing about targeted and inaccurate copyright enforcement. While it may be difficult to quantify the actual extent of inaccurate claims, we can at least try to understand whether copyright enforcement companies are making a “good faith” best effort to minimize any false positives. My short answer: not really.

Let’s start with a typical abuse letter that gets sent to a network provider (in this case, a university) from a “copyright enforcement” company such as the Video Protection Alliance.

This notice is intended solely for the primary Massachusetts Institute of Technology internet service account holder. Someone using this account has engaged in illegal copying or distribution (downloading or uploading) of …

Evidence:
Infringement Source: BitTorrent
Infringement Timestamp: 2009-08-28 09:33:20 PST
Infringers IP Address: 128.31.1.13
Infringers Port: 40951

The information in this notification is accurate. We have a good faith belief that use of the material in the manner complained of herein is not authorized by the copyright owner, its agent, or by operation of law. We swear under penalty of perjury, that we are authorized to act on behalf of DISCOUNT VIDEO CENTER INC..

You and everyone using this computer must immediately and permanently cease and desist the unauthorized copying and/or distribution (including, but not limited to, downloading, uploading, file sharing, file ‘swapping’ or other similar activities) of the videos and/or other content owned by DISCOUNT VIDEO CENTER INC., including, but is not limited to, the copyrighted material listed above.

DISCOUNT VIDEO CENTER INC. is prepared to pursue every available remedy including damages, recovery of attorney’s fees, costs and any and all other claims that may be available to it in a lawsuit filed against you.

While DISCOUNT VIDEO CENTER INC. is entitled to monetary damages, attorneys’ fees and court costs from the infringing party under 17 U.S.C. 504, DISCOUNT VIDEO CENTER INC. believes that it may be beneficial to settle this matter without the need of costly and time-consuming litigation. We have been authorized to offer a reasonable settlement to resolve the infringement of the works listed above. To access this settlement offer, please follow the directions below.

Settlement Offer: To access your settlement offer please copy and paste the address below into a browser and follow the instructions:

https://www.videoprotectionalliance.com/?n_id=AB-XXXXXX

Password: XXXXXXX

In other words: we have a record of you (supposedly) uploading and downloading BitTorrent content. That content is copyrighted. We could pursue costly and painful litigation, but if you want us to just go away, you can pay us now.

Now, any type of IP-based identification is not going to be perfect, especially given the widespread use of Network Address Translation (NAT) boxes and open WiFi at homes. Especially in dense urban areas, unapproved third parties might use their neighbor’s wireless network for Internet access, potentially leading to the wrong homeowner being blamed. And IP-based identification relies on accurate ISP mappings from IP addresses to users, as these mappings change over time (although typically slowly) given dynamic address assignment (i.e., DHCP). But one could rightly claim that such sources of false positives are rare in practice and that an enforcement company is still making a best effort to accurately identify IP addresses engaging in copyright-infringing file sharing.

So what’s a reasonable strategy to identify such infringing behavior?

Let me first give a high-level overview of how BitTorrent works. To download a particular file on BitTorrent, a client first needs to discover a set of other peers that have the file. Earlier peer-to-peer systems like Napster, Gnutella, and KaZaA had peers connect to one another somewhat randomly (or, in Napster, through a more centralized directory service). These peers would then broadcast search requests for files, downloading the content directly from those peers that responded as having matching files. In the basic BitTorrent architecture, on the other hand, the global ecosystem is split into distinct groups of users that are all trying to download a particular file. Each such group, known as a swarm, is managed by a centralized server called a tracker. The tracker keeps a list of the swarm’s peers, along with the coarse progress counters each peer reports when it announces (such as how many bytes it has left to download). When a client joins a swarm by announcing itself to the tracker, it gets a list of other peers, and it subsequently attempts to connect to them and download file blocks. How a client discovers a particular swarm is outside the scope of the system, but there are plenty of BitTorrent search engines that allow clients to perform keyword searches. These searches return .torrent files, which include high-level meta-data about a particular swarm, including the URL(s) at which its tracker(s) can be accessed.

So there are three phases to downloading content from BitTorrent:

  1. Finding a .torrent meta-data file
  2. Registering with the .torrent’s tracker and getting a list of peer addresses
  3. Connecting to a peer, swapping the bit-vector of which file blocks each has, and potentially downloading or uploading needed blocks
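To make step 3 concrete, here is a minimal sketch of the bit-vector comparison two peers perform once connected. It models each bit-vector as a list of booleans, one per file block; the function name `blocks_to_request` is an illustrative assumption, and real clients use a packed bitfield and a smarter request order.

```python
from typing import List


def blocks_to_request(mine: List[bool], theirs: List[bool]) -> List[int]:
    """Indices of blocks the remote peer has and the local peer still needs."""
    if len(mine) != len(theirs):
        raise ValueError("bit-vectors must describe the same number of blocks")
    return [i for i, (have, offered) in enumerate(zip(mine, theirs))
            if offered and not have]


# The local peer holds block 0 only; the remote peer holds blocks 0 and 1.
wanted = blocks_to_request([True, False, False], [True, True, False])
print(wanted)  # [1]
```

Only at this step does any content actually move between peers; steps 1 and 2 involve metadata and address lists alone.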

Unfortunately, the verification that copyright enforcement agencies such as the VPA use stops at #2. That is, if some random BitTorrent tracker lists your IP address as being part of a swarm, then the VPA considers this to be sufficient proof to warrant a DMCA takedown notice (such as the one above), with clear instructions on how to pay a monetary settlement. Now, a very reasonable question is whether such information should indeed constitute proof.

Last year, researchers at the University of Washington published a paper with the subtitle Why My Printer Received a DMCA Takedown Notice. Their conclusions were that:

  • Practically any Internet user can be framed for copyright infringement today.
  • Even without being explicitly framed, innocent users may still receive complaints.

The title came from the fact that they “registered” the IP address of a networked printer with BitTorrent trackers, and they subsequently received 9 DMCA takedown notices claiming that their printer was engaging in illegal file sharing. (They did not, however, receive any pre-settlement offers such as the one above, which suggests a possible escalation of enforcement techniques since then.)

I have had my own repeated experiences with such false claims. This September, for instance, a research system I operate called CoralCDN received approximately 100 pre-settlement letters, including the one above. A little background: CoralCDN is an open, free, self-organizing content distribution network (CDN). CDNs are widely used by commercial high-volume websites to scalably deliver their content, such as Hulu’s use of Akamai or CNN’s use of Level 3. CoralCDN was designed to help solve the Slashdot effect, which is when portals such as slashdot.org link to underprovisioned third-party sites and cause that site to become quickly overwhelmed by the unexpected surge of resulting traffic. CoralCDN’s answer was to provide an open CDN that would cache and serve any URL that was requested from it. To use CoralCDN, one simply appends a suffix to a URL’s hostname, i.e., http://www.cnn.com/ becomes http://www.cnn.com.nyud.net/. CoralCDN’s been running on PlanetLab—a distributed research testbed of virtualized servers, spread over several hundred universities worldwide—since March 2004. It handles requests from about 2 million users per day.
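The URL rewriting just described is simple enough to sketch. The helper name `coralize` is hypothetical. Folding an explicit port into the hostname as an extra label is inferred from the Coralized tracker URL quoted in this post (`...h3q.com.6969.nyud.net`), so treat that part as an assumption about the convention.

```python
from urllib.parse import urlsplit, urlunsplit

CORAL_SUFFIX = ".nyud.net"


def coralize(url: str) -> str:
    """Rewrite a URL so that it is fetched through CoralCDN."""
    parts = urlsplit(url)
    host = parts.hostname
    if not host:
        raise ValueError("URL has no hostname: %r" % url)
    if parts.port is not None:
        # Assumed convention: an explicit port becomes one more hostname label.
        host = "%s.%d" % (host, parts.port)
    return urlunsplit((parts.scheme, host + CORAL_SUFFIX,
                       parts.path, parts.query, parts.fragment))


print(coralize("http://www.cnn.com/"))  # http://www.cnn.com.nyud.net/
```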

Because CoralCDN provides an open platform, one can access any URL through it via an HTTP GET request (with the exception of a small number of blacklisted domains and those for content larger than 50MB). Thus, requests to BitTorrent trackers can also use CoralCDN, as these are simply HTTP GETs with a client’s relevant information encoded in the tracker URL’s query string, e.g., http://denis.stalker.h3q.com.6969.nyud.net/announce?info_hash=(hash)&peer_id=(name)&port=52864&uploaded=231374848&downloaded=2227372596&left=0&corrupt=0&key=E0591124&numwant=200&compact=1&no_peer_id=1.

Notice that the HTTP request includes a peer’s unique name (a long random string) and a port number, but notably does not include an IP address for that client. It’s an optional parameter in the specification that many BitTorrent clients don’t include. (In fact, even if the request includes this IP parameter, some trackers ignore it.) Instead, the tracker records the network-level IP address from where the HTTP request originated (the other end of the TCP connection), together with the supplied port, as the peer’s network address.

When this request is via an HTTP proxy, things go wrong. Here, the BitTorrent client is connecting to an HTTP proxy, which in turn is connecting to the tracker. So this practice results in the tracker recording an unusable address: the combination of the proxy’s IP and the client’s port. Needless to say, the proxy isn’t running BitTorrent, let alone on that particular (often randomized) port. Not only does this design damage the client’s BitTorrent experience—other clients won’t initiate communication with it, leading to fewer opportunities for “tit-for-tat” data exchanges—but this also damages the entire swarm’s performance: Others’ requests to this hybrid address will all fail (typically with an RST response to the TCP connection request). I was rather surprised to find this flaw in the BitTorrent specification.
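A small simulation shows how the hybrid address arises. The function `recorded_peer_address` is a hypothetical stand-in for a tracker's bookkeeping, `tracker.example` and the client address 198.51.100.7 are made up, and the proxy address and port are the ones from the letter quoted earlier.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlsplit


def recorded_peer_address(connection_ip: str, announce_url: str) -> Tuple[str, int]:
    """Pair the TCP endpoint's IP with the port taken from the query string."""
    query = parse_qs(urlsplit(announce_url).query)
    port = int(query["port"][0])
    # The request carries no "ip" parameter, so the tracker falls back to
    # whichever host opened the TCP connection.
    return (connection_ip, port)


ANNOUNCE = "http://tracker.example/announce?info_hash=abc&peer_id=xyz&port=40951&left=0"
CLIENT_IP = "198.51.100.7"  # hypothetical BitTorrent client
PROXY_IP = "128.31.1.13"    # HTTP proxy that relays the announce

print(recorded_peer_address(CLIENT_IP, ANNOUNCE))  # ('198.51.100.7', 40951)
print(recorded_peer_address(PROXY_IP, ANNOUNCE))   # ('128.31.1.13', 40951)
```

The second result is the unusable combination: the proxy's IP with the client's port, where nothing is listening.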

So how is this related to CoralCDN and the VPA? For whatever reason, some publisher started including a Coralized URL for the tracker’s location, as shown above (http://denis.stalker.h3q.com.6969.nyud.net/). I could only surmise why this was done: perhaps on the (mistaken) assumption that it would reduce load on the server, or perhaps in the hope of offloading abuse complaints to CoralCDN servers. The latter might have been useful if copyright enforcement agencies were going after the trackers, instead of the participating peers. In fact, we initially thought this was the case when these pre-settlement letters from the VPA started rolling in. More careful analysis, however, exposed the above problem: when the BitTorrent URL was Coralized, peers’ requests to the tracker were issued via CoralCDN HTTP proxies. Thus, the tracker built up a list of peer addresses of the form (CoralCDN IP : peer port), where these CoralCDN IPs correspond to PlanetLab servers located at various universities.

Hence, when the VPA began sending out pre-settlement letters claiming infringement, they sent them to network operators at tens of universities, who turned around and forwarded them to PlanetLab’s central operations and me.

What is particularly striking about this case, however, is that these reports were demonstrably false! There was no BitTorrent client running at the specified address (in the above letter, 128.31.1.13:40951), for precisely the reasons I discussed above. Thus, we can fairly definitively conclude that the VPA never actually tested the peer for actual infringement: not even by trying to connect to the client’s address, let alone determining whether the client was actually uploading or downloading any data, and let alone valid data corresponding to the copyrighted file in question.

This raises the question of what should be required for a company to issue a DMCA notification and pre-settlement letters that assert:

Someone using this account has engaged in illegal copying or distribution (downloading or uploading)…The information in this notification is accurate. We have a good faith belief that use of the material in the manner complained of herein is not authorized by the copyright owner.

Of course, the incentives for the VPA to actually ensure that “this notification is accurate” are pretty clear. The cost of a false positive is currently nothing, and perhaps some innocent users will even “buy protection” to make this problem and the threat of costly litigation go away.

DISCOUNT VIDEO CENTER INC. believes that it may be beneficial to settle this matter without the need of costly and time-consuming litigation. We have been authorized to offer a reasonable settlement to resolve the infringement of the works listed above.

It appears that the VPA and other such agencies have been rather effective at getting some settlement money. Our personal experience with DMCA takedown notices is that network operators are suitably afraid of litigation. Many will pull network access from machines as soon as a complaint is received, without any further verification or demonstrative network logs. In fact, many operators also sought “proof” that we weren’t running BitTorrent or engaging in file sharing before they were willing to restore access. We’ll leave the discussion about how we might prove such a negative to another day, but one can point to the chilling effect that such notices have had, when users are immediately considered guilty and must prove their innocence.

I am not arguing that copyright owners should not be able to take reasonable steps to protect their copyrighted material. I am arguing, however, that they should take similarly reasonable steps to ensure that any claimed infringement actually took place. When DMCA notices are accompanied by oaths under “penalty of perjury” and these claims are accepted as writ, as they have de facto become, there should be some downside for agencies that demonstrably do not act in “good faith” to verify infringement. Even a simple TCP connection attempt would have been enough to dispel their flawed assumptions. That currently seems to be too much to ask.
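For illustration, the kind of connection attempt meant here takes only a few lines. This is a sketch with a hypothetical helper name, `accepts_tcp`, and it tests only whether anything answers on the listed address. It says nothing about whether the listener speaks BitTorrent, and a firewall or NAT can also make a genuine peer unreachable.

```python
import socket


def accepts_tcp(ip: str, port: int, timeout: float = 5.0) -> bool:
    """True if a TCP connection to (ip, port) can be established."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections (RST), timeouts, and unreachable hosts.
        return False
```

A tracker-listed address that refuses connections, as the proxy addresses here did, is a strong hint that the listing should be checked further before any letter goes out.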

Update (Dec 15): A follow-up post can be found here.


A Freedom-of-Speech-based Approach To Limiting Filesharing – Part III: Smoke, smoke!

Over the past two days we have seen that filesharing is vulnerable to spamming, and that as a defense, the filesharers have used the IP block list to exclude the spammers from sharing files. Today I discuss how I think lawyers and laypeople should look at the legal issues. Since I am most decidedly not a lawyer, nothing I say here should be considered definitive. Hopefully, it is at least interesting.

An analogy:

Washington Square, in New York City, was for many years a place where drugs were sold. A fellow would stand around quietly saying to passersby “Smoke, smoke!” However, this so-called “steerer” held no drugs. His role was simply to direct the buyer to the “pitcher”, who had the drugs somewhere nearby, and who kept silent.

Even the strongest defender of free-speech rights understands that the “steerer’s” words are not just speech. His words are not similar to those of this article, though both simply say that someone in the park is selling. He is as legally responsible for the sale as the “pitcher”, because they are, according to legal terminology, “acting in concert”. He is a drug dealer who may never touch any drugs. Note also that the “steerer” receives payments from the illegal transactions – though it is not in fact legally necessary to be able to prove the payments to establish that he’s “acting in concert”. All that’s required is that the “steerer” and the “pitcher” share “community of purpose” in facilitating the illegal transaction.

In the Napster case, the court held that Napster, even though it did not have any copyrighted data on its servers, was liable for contributory infringement. To use Napster, a downloader would login to Napster’s central server, which connected the user to another user who had a file that was being searched for. Since it was Napster’s role to hook up the parties illegally exchanging files, it is reasonable to see this as analogous to the “steerer” in Washington Square – Napster didn’t have the infringing materials, but that really isn’t a defense.

The gnutella network is decentralized to solve the legal problem presented by the Napster decision. Nonetheless, there is something still centralized in gnutella: the IP block list. Users of LimeWire get their block list from LimeWire and only from LimeWire. Accordingly, if Napster was like the “steerer” in Washington Square, LimeWire furthers the “community of purpose” in a different way; it is someone who gives negative information rather than affirmative. He’s someone paid to stand in the park pointing out who are cheaters selling bad drugs, allowing the purchasers to find the good stuff.

What is a legitimate P2P spam filtering authority versus one that shares “community of purpose” with infringers? The former could legitimately act to keep the network from being flooded by those selling weight loss drugs, without facilitating infringing. There is probably no bright-line rule, but it is reasonably clear that LimeWire is well on the wrong side of any possible grey area.

It’s useful to compare gnutella spam cop LimeWire with e-mail spam cop AOL.

LimeWire does not clearly advertise its spam cop role as a feature of its software, and does not discuss its block list. (The LimeWire web site has only the cryptic description “We’re always working to protect you from viruses and unwanted sharing.”) There is no discussion anywhere about what sorts of sites and files it is blocking and for what reason. No notification is given by LimeWire to a site when it is blocked, nor is there any way given to contact LimeWire to remove yourself from the block list.

In comparison, blocking e-mail spam is, for AOL, a major selling point. AOL does not block bulk e-mailers (many of which are legitimate) on a whim. Every e-mail rejected by AOL is bounced with a notification to the sender, and there are detailed instructions to bulk e-mailers as to what they need to do to avoid running afoul of AOL’s filters. There is a way to contact AOL to remove oneself from the block list, if one is legitimate. The whole process is transparent.

It is clear that a legitimate spam cop cannot block spoofers, since any search for a non-infringing file would be unmolested by spoofs, yet it appears that LimeWire does block MediaDefender. In fact, LimeWire appears to be quietly promising to do so, when it says that it protects against “unwanted sharing”, whatever that is.

Lastly, it appears that LimeWire’s statements in court conceal what it is doing.

As we mentioned in the first post, there is an ongoing case, Arista v Lime Group. In its motion for Summary Judgement, LimeWire states

Likewise, LW does not have the ability to control the manner in which users employ the LimeWire software. Unlike the Napster defendants, LW does not maintain central servers containing files or indices of files. … LW’s system is like that analysed by the Ninth Circuit in Grokster, “truly decentralized”. … LW no more controls the actions of its customers than do any of the thousands of companies that provide hardware or other software used in connection with the internet.

This omits any discussion of LimeWire’s centralized block list. LW assuredly does control the manner in which LimeWire users employ the LimeWire software, because if a site is added to the IP block list, it is no longer visible to most LimeWire users. This is very far from the normal situation applying in other software used in connection with the internet.
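To make the control point concrete, here is a small Python sketch of how a centrally supplied block list could hide hosts from a client's search results. The format and distribution of LimeWire's actual list are not described in this post, so the function `visible_hosts`, the list format (single addresses or CIDR ranges) and the addresses (taken from the ranges reserved for documentation) are all assumptions.

```python
import ipaddress


def visible_hosts(search_hits, block_list):
    """Return the search hits whose address is not covered by the block list.

    Hypothetical model: `block_list` entries are single addresses or CIDR
    ranges, as a central publisher might distribute them.
    """
    networks = [ipaddress.ip_network(entry) for entry in block_list]
    visible = []
    for hit in search_hits:
        addr = ipaddress.ip_address(hit)
        # A host is hidden as soon as any listed network contains its address.
        if not any(addr in net for net in networks):
            visible.append(hit)
    return visible


hits = ["192.0.2.10", "198.51.100.7", "203.0.113.5"]
blocked = ["192.0.2.0/24", "198.51.100.7"]
print(visible_hosts(hits, blocked))
```

In this model, whoever publishes `blocked` decides which hosts every client can see, which is the kind of control the motion says LW does not have.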

Moreover, the plaintiffs’ attorneys appear to be unaware of the blocking of spoofs, as their reply motion makes no mention of it (nor the other hidden features of LimeWire software discussed yesterday).

While it might be possible to run a legitimate spam-blocking service for P2P networks, it would look rather different from what LimeWire is doing.

Conclusion

The best way to regulate filesharing effectively is to analyze the various players’ roles on free-speech grounds. The individual filesharers (when they share infringing material) are certainly violating the law, but in a small way that probably can’t be reasonably controlled. The publishers of the software that allows the network to run (including LimeWire) are exercising free speech – the fact that their code can be made to do something illegal should be irrelevant. However, LimeWire is facilitating infringing because of the way it runs its IP block list. If LimeWire were shut down, the gnutella network would become useless for downloading infringing music. Because of their actions to keep the network safe for infringers – their “acting in concert” – LimeWire should be liable for contributory infringement.

This course will avoid free speech restrictions that trouble many. In terms of preventing infringing, it also will be far more productive than trying to target the small fish. It is an effective measure that respects rights.

[This series of posts has been a somewhat shortened version of an article here.]


A Freedom-of-Speech Approach To Limiting Filesharing – Part I: Filesharing and Spam

[Today we kick off a series of three guest posts by Mitch Golden. Mitch was a professor of physics when, in 1995, he was bitten by the Internet bug and came to New York to become an entrepreneur and consultant. He has worked on a variety of Internet enterprises, including one in the filesharing space. As usual, the opinions expressed in these posts are Mitch's alone. -- Ed]

The battle between the record labels and filesharers has been somewhat out of the news of late, but it rages on still. There is an ongoing court case, Arista Records v LimeWire, in which a group of record labels is suing to have LimeWire held accountable for the copyright infringing done by its users. Though this case has attracted less attention than similar cases before it, it may raise interesting issues not addressed in previous cases. Though I am a technologist, not a lawyer, this series of posts will advocate a way of looking at the issues, including the legal ones, using a freedom-of-speech based approach, which leads to some unusual conclusions.

Let’s start by reviewing some salient features of filesharing.

Filesharing is a way for a group of people – who generally do not know one another – to allow one another to see what files they collectively have on their machines, and to exchange desired files with each other. There are at least two components to a filesharing system: one that allows a user who is looking for a particular file to see if someone has it, and another that allows the file to be transferred from one machine to the other.

One of the most popular filesharing programs in current use is LimeWire, which uses a protocol called gnutella. Gnutella is decentralized, in the sense that neither the search nor the exchange of files requires any central server. It is possible, therefore, for people to exchange copyrighted files – in violation of the law – without creating any log of the search or exchange in a central repository.
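To illustrate what a search without a central server can look like, here is a toy Python sketch of query flooding, in which each peer passes a query on to its neighbors until a hop limit (TTL) runs out. It is loosely modeled on the flooding used by early gnutella and is not the actual protocol. The peer names, the `flood_search` function and the data layout are invented for the example.

```python
from collections import deque


def flood_search(neighbors, shared_files, start, filename, ttl):
    """Return peers within `ttl` hops of `start` that share `filename`.

    Toy model: `neighbors` maps a peer to the peers it is connected to and
    `shared_files` maps a peer to the set of file names it offers.
    """
    hits = []
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer != start and filename in shared_files.get(peer, set()):
            hits.append(peer)
        if remaining == 0:
            continue  # hop limit reached: this peer does not forward the query
        for nxt in neighbors.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


neighbors = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
}
shared_files = {"carol": {"song.mp3"}, "dave": {"song.mp3"}}
print(flood_search(neighbors, shared_files, "alice", "song.mp3", ttl=2))
```

No peer in this model holds an index of the whole network: each one answers for its own files and forwards the question, so there is no central log of searches to subpoena.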

The gnutella protocol was originally created by developers from Nullsoft, the company that had developed the popular music player WinAmp, shortly after it was acquired by AOL. AOL was at that time merging with Time Warner, a huge media company, and so the idea that they would be distributing a filesharing client was quite unamusing to management. Work was immediately discontinued; however, the source for the client and the implementation of the protocol had already been released under the GPL, and so development continued elsewhere. LimeWire made improvements both to the protocol and the interface, and their client became quite popular.

The decentralized structure of filesharing does not serve a technical purpose. In general, centralized searching is simpler, quicker and more efficient, and so, for example, to search the web we use Google or Yahoo, which are gigantic repositories. In filesharing, the decentralized search structure instead serves a legal purpose: to diffuse the responsibility so no particular individual or organization can be held accountable for promoting the illegal copying of copyrighted materials. At the time the original development was going on, the Napster case was in the news, in which the first successful filesharing service was being sued by the record labels. The outcome of that case a few months later resulted in Napster being shut down, as the US courts held Napster (which was a centralized search repository) responsible for the copyright-infringing filesharing its users were doing.

Whatever their legal or technical advantages, decentralized networks, by virtue of their openness, are vulnerable to a common problem: spam. For example, because anyone may send anyone else an e-mail, we are all subject to a deluge of messages trying to sell us penny stocks and weight loss remedies. Filesharing too is subject to this sort of cheating. If someone is looking for, say, Rihanna’s recording Disturbia, and downloads an mp3 file that purports to be such, what’s to stop a spammer from instead serving a file with an audio ad for a Canadian pharmacy?

Spammers on the filesharing networks, however, have more than just the usual commercial motivations in mind. In general, there are four categories of fake files that find their way onto the network.

  • Commercial spam
  • Pornography and Ads for Pornography
  • Viruses and trojans
  • Spoof files

The last of these has no real analogue to anything people receive in e-mail. It works as follows: if, for example, Rihanna’s record label wants to prevent you from downloading Disturbia, they might hire a company called MediaDefender. MediaDefender’s business is to put as many spoof files as possible on gnutella that purport to be Disturbia, but instead contain useless noise. If MediaDefender can succeed in flooding the network so that the real Disturbia is a needle in a haystack, then the record label has thwarted gnutella’s users from violating their copyright.
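The needle-in-a-haystack effect is easy to put rough numbers on. The Python sketch below assumes, purely for illustration, that a downloader picks uniformly at random among all files claiming to be the song, and that repeated attempts are independent. The function name and the counts are made up and say nothing about real spoofing volumes.

```python
def chance_of_real_file(real_copies: int, spoof_copies: int, attempts: int = 1) -> float:
    """Probability that at least one of `attempts` random picks is a real copy.

    Illustrative assumption: each pick is uniform over all copies and
    independent of the others.
    """
    total = real_copies + spoof_copies
    if total == 0 or real_copies == 0:
        return 0.0
    p_spoof = spoof_copies / total
    # Probability that every attempt is a spoof, subtracted from 1.
    return 1.0 - p_spoof ** attempts


# Hypothetical numbers: 10 real copies hidden among 990 spoofs.
print(chance_of_real_file(10, 990))              # one attempt, about 1%
print(chance_of_real_file(10, 990, attempts=5))  # five attempts
```

Under these assumptions a single download succeeds about one time in a hundred, so a user who cannot tell spoofs from the real thing has to waste many downloads to find one good file.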

Since people are still using filesharing, clearly a workable solution has been found to the problem of spoof files. In tomorrow’s post, I discuss this solution, and in the following post, I suggest its legal ramifications.


Study Shows DMCA Takedowns Based on Inconclusive Evidence

A new study by Michael Piatek, Yoshi Kohno and Arvind Krishnamurthy at the University of Washington shows that copyright owners’ representatives sometimes send DMCA takedown notices where there is no infringement – and even to printers and other devices that don’t download any music or movies. The authors of the study received more than 400 spurious takedown notices.

Technical details are summarized in the study’s FAQ:

Downloading a file from BitTorrent is a two step process. First, a new user contacts a central coordinator [a "tracker" – Ed] that maintains a list of all other users currently downloading a file and obtains a list of other downloaders. Next, the new user contacts those peers, requesting file data and sharing it with others. Actual downloading and/or sharing of copyrighted material occurs only during the second step, but our experiments show that some monitoring techniques rely only on the reports of the central coordinator to determine whether or not a user is infringing. In these cases whether or not a peer is actually participating is not verified directly. In our paper, we describe techniques that exploit this lack of direct verification, allowing us to frame arbitrary Internet users.

The existence of erroneous takedowns is not news – anybody who has seen the current system operating knows that some notices are just wrong, for example referring to unused IP addresses. Somewhat more interesting is the result that it is pretty easy to “frame” somebody so they get takedown notices despite doing nothing wrong. Given this, it would be a mistake to infer a pattern of infringement based solely on the existence of takedown notices. More evidence should be required before imposing punishment.
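The gap between the two steps can be shown with a toy model in Python. This is not the study's code or a real tracker: the `Tracker` class, the two monitor functions and the addresses (from the ranges reserved for documentation) are hypothetical, and the model simply assumes a coordinator that records whatever address a client claims.

```python
class Tracker:
    """Toy coordinator that records whatever address an announce claims."""

    def __init__(self):
        self.peers = set()

    def announce(self, claimed_ip):
        # Assumption for illustration: the claimed address is not checked.
        self.peers.add(claimed_ip)

    def peer_list(self):
        return sorted(self.peers)


def tracker_only_monitor(tracker):
    """Flag every address the tracker reports, with no further check."""
    return tracker.peer_list()


def verifying_monitor(tracker, actually_sharing):
    """Flag only addresses that also hand over file data when contacted.

    `actually_sharing` stands in for the result of contacting each peer.
    """
    return [ip for ip in tracker.peer_list() if ip in actually_sharing]


tracker = Tracker()
tracker.announce("198.51.100.20")  # a real downloader
tracker.announce("192.0.2.99")     # a printer's address, announced by someone else
actually_sharing = {"198.51.100.20"}

print(tracker_only_monitor(tracker))
print(verifying_monitor(tracker, actually_sharing))
```

The tracker-only monitor flags both addresses, printer included, while the verifying monitor flags only the machine that actually serves data. That second step is what the notices described here appear to skip.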

Now it’s not entirely crazy to send some kind of soft “warning” to a user based on the kind of evidence described in the Washington paper. Most of the people who received such warnings would probably be infringers, and if it’s nothing more than a warning (“Hey, it looks like you might be infringing. Don’t infringe.”) it could be effective, especially if the recipients know that with a bit more work the copyright owner could gather stronger evidence. Such a system could make sense, as long as everybody understood that warnings were not evidence of infringement.

So are copyright owners overstepping the law when they send takedown notices based on inconclusive evidence? Only a lawyer can say for sure. I’ve read the statute and it’s not clear to me. Readers who have an informed opinion on this question are encouraged to speak up in the comments.

Whether or not copyright owners can send warnings based on inconclusive evidence, the notification letters they actually send imply that there is strong evidence of infringement. Here’s an excerpt from a letter sent to the University of Washington about one of the (non-infringing) study computers:

XXX, Inc. swears under penalty of perjury that YYY Corporation has authorized XXX to act as its non-exclusive agent for copyright infringement notification. XXX’s search of the protocol listed below has detected infringements of YYY’s copyright interests on your IP addresses as detailed in the attached report.

XXX has reasonable good faith belief that use of the material in the manner complained of in the attached report is not authorized by YYY, its agents, or the law. The information provided herein is accurate to the best of our knowledge. Therefore, this letter is an official notification to effect removal of the detected infringement listed in the attached report. The attached documentation specifies the exact location of the infringement.

The statement that the search “has detected infringements … on your IP addresses” is not accurate, and the later reference to “the detected infringement” also misleads. The letter contains details of the purported infringement, which once again give the false impression that the letter’s sender has verified that infringement was actually occurring:

Evidentiary Information:
Notice ID: xx-xxxxxxxx
Recent Infringement Timestamp: 5 May 2008 20:54:30 GMT
Infringed Work: Iron Man
Infringing FileName: Iron Man TS Kvcd(A Karmadrome Release)KVCD by DangerDee
Infringing FileSize: 834197878
Protocol: BitTorrent
Infringing URL: http://tmts.org.uk/xbtit/announce.php
Infringers IP Address: xx.xx.xxx.xxx
Infringer’s DNS Name: d-xx-xx-xxx-xxx.dhcp4.washington.edu
Infringer’s User Name:
Initial Infringement Timestamp: 4 May 2008 20:22:51 GMT

The obvious question at this point is why the copyright owners don’t do the extra work to verify that the target of the letter is actually transferring copyrighted content. There are several possibilities. Perhaps BitTorrent clients can recognize and shun the detector computers. Perhaps they don’t want to participate in an act of infringement by sending or receiving copyrighted material (which would be necessary to know that something on the targeted computer is willing to transfer it). Perhaps it simply serves their interests better to send lots of weak accusations, rather than fewer stronger ones. Whatever the reason, until copyright owners change their practices, DMCA notices should not be considered strong evidence of infringement.


Comcast's Disappointing Defense

Last week, Comcast offered a defense in the FCC proceeding challenging the technical limitations it had placed on BitTorrent traffic in its network. (Back in October, I wrote twice about Comcast’s actions.)

The key battle line is whether Comcast is just managing its network reasonably in the face of routine network congestion, as it claims, or whether it is singling out certain kinds of traffic for unnecessary discrimination, as its critics claim. The FCC process has generated lots of verbiage, which I can’t hope to discuss, or even summarize, in this post.

I do want to call out one aspect of Comcast’s filing: the flimsiness of its technical argument.

Here’s one example (p. 14-15).

As Congresswoman Mary Bono Mack recently explained:

The service providers are watching more and more of their network monopolized by P2P bandwidth hogs who command a disproportionate amount of their network resources. . . . You might be asking yourself, why don’t the broadband service providers invest more into their networks and add more capacity? For the record, broadband service providers are investing in their networks, but simply adding more bandwidth does not solve [the P2P problem]. The reason for this is P2P applications are designed to consume as much bandwidth as is available, thus more capacity only results in more consumption.

(emphasis in original). The flaws in this argument start with the fact that the italicized segment is wrong. P2P protocols don’t aim to use more bandwidth rather than less. They’re not sparing with bandwidth, but they don’t use it for no reason, and there does come a point where they don’t want any more.

But even leaving aside the merits of the argument, what’s most remarkable here is that Comcast’s technical description of BitTorrent cites as evidence not a textbook, nor a standards document, nor a paper from the research literature, nor a paper by the designer of BitTorrent, nor a document from the BitTorrent company, nor the statement of any expert, but a speech by a member of Congress. Congressmembers know many things, but they’re not exactly the first group you would turn to for information about how network protocols work.

This is not the only odd source that Comcast cites. Later (p. 28) they claim that the forged TCP Reset packets that they send shouldn’t be called “forged”. For this proposition they cite some guy named George Ou who blogs at ZDNet. They give no reason why we should believe Mr. Ou on this point. My point isn’t to attack Mr. Ou, who for all I know might actually have some relevant expertise. My point is that if this is the most authoritative citation Comcast can find, then their argument doesn’t look very solid. (And, indeed, it seems pretty uncontroversial to call these particular packets “forged”, given that they mislead the recipient about (1) which IP address sent the packet, and (2) why the packet was sent.)

Comcast is a big company with plenty of resources. It’s a bit depressing that they would file arguments like this with the FCC, an agency smart enough to tell the difference. Is this really the standard of technical argumentation in FCC proceedings?


Could Use-Based Broadband Pricing Help the Net Neutrality Debate?

Yesterday, thanks to a leaked memo, it came to light that Time Warner Cable intends to try out use-based broadband pricing on a few of its customers. It looks like the plan is for several tiers of use, with the heaviest users possibly paying overage charges on a per-byte basis. In confirming its plans to Reuters, Time Warner pointed out that its heaviest-using five percent of customers generate the majority of data traffic on the network, but still pay as though they were typical users. Under the new proposal, pricing would be based on the total amount of data transferred, rather than the peak throughput on a connection.

If the current, flattened pricing is based on what the connection is worth to a typical customer, who makes only limited use of the connection, then the heaviest five percent of users (let’s call them super-users as shorthand) are reaping a surplus. Bandwidth use might be highly elastic with respect to price, but I think it is also true that the super-users do reap a great deal more benefit from their broadband connections than other users do – think of those who pioneer video consumption online, for example.

What happens when network operators fail to see this surplus? They have marginally less incentive to build out the network and drive down the unit cost of data transfer. If the pricing model changed so that network providers’ revenue remained the same in total but was based directly on how much the network is used, then the price would go down for the lightest users and up for the heaviest. If a tiered structure left prices the same for most users and raised them on the heaviest, operators’ total revenue would go up. In either case, networks would have an incentive to encourage innovative, high-bandwidth uses of their networks – regardless of what kind of use that is.
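A worked example may help with the revenue-neutral case. The Python sketch below uses invented numbers (twenty customers, a $40 flat price, one heavy user) and a pure per-gigabyte price rather than the tiers Time Warner is reportedly planning, so treat it as an illustration of the arithmetic only.

```python
def revenue_neutral_usage_bills(usage_gb, flat_price):
    """Per-customer bills under a per-GB price that keeps total revenue unchanged.

    Illustrative model: total revenue under flat pricing is divided by total
    traffic to get a single per-GB price.
    """
    total_revenue = flat_price * len(usage_gb)
    price_per_gb = total_revenue / sum(usage_gb)
    return [price_per_gb * gb for gb in usage_gb]


# Hypothetical network: 19 light users at 10 GB each and one super-user at
# 310 GB, so 5% of customers generate 62% of the traffic.
usage = [10] * 19 + [310]
bills = revenue_neutral_usage_bills(usage, flat_price=40.0)

print(round(bills[0], 2))    # a light user's bill
print(round(bills[-1], 2))   # the super-user's bill
print(round(sum(bills), 2))  # total revenue, same as 20 * $40
```

With these made-up figures the per-gigabyte price works out to $1.60, so each light user's bill falls from $40 to $16 and the super-user's rises to $496, while the operator still collects $800 in total.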

Gigi Sohn of Public Knowledge has come out in favor of Time Warner’s move on these and other grounds. It’s important to acknowledge that network operators still have familiar, monopolistic reasons to intervene against traffic that competes with phone service or cable. But under the current pricing structure, they’ve had a relatively strong argument to discriminate in favor of the traffic they can monetize, and against the traffic they can’t. By allowing them to monetize all traffic, a shift to use-based pricing would weaken one of the most persuasive reasons network operators have to oppose net neutrality.


Universal Didn't Ignore Digital, Just Did It Wrong

Techies have been chortling all week about comments made by Universal Music CEO Doug Morris to Wired’s Seth Mnookin. Morris, despite being in what is now a technology-based industry, professed extreme ignorance about the digital world. Here’s the money quote:

Morris insists there wasn’t a thing he or anyone else could have done differently. “There’s no one in the record company that’s a technologist,” Morris explains. “That’s a misconception writers make all the time, that the record industry missed this. They didn’t. They just didn’t know what to do. It’s like if you were suddenly asked to operate on your dog to remove his kidney. What would you do?”

Personally, I would hire a vet. But to Morris, even that wasn’t an option. “We didn’t know who to hire,” he says, becoming more agitated. “I wouldn’t be able to recognize a good technology person — anyone with a good bullshit story would have gotten past me.” Morris’ almost willful cluelessness is telling. “He wasn’t prepared for a business that was going to be so totally disrupted by technology,” says a longtime industry insider who has worked with Morris. “He just doesn’t have that kind of mind.”

Morris’s explanation isn’t just pathetic, it’s also wrong. The problem wasn’t that the company had no digital strategy. They had a strategy, and they had technologists on the payroll who were supposed to implement it. But their strategy was a bad one, combining impractical copy-protection schemes with locked-down subscription services that would appeal to few if any customers.

The most interesting side of the story is that Universal’s strategy is improving now – they’re selling unencumbered MP3s, for example – even though the same proud technophobe is still in charge.

Why the change?

The best explanation, I think, is a fear that Apple would use its iPod/iTunes technologies to grab control of digital music distribution. If Universal couldn’t quite understand the digital transition, it could at least recognize a threat to its distribution channel. So it responded by competing – that is, trying to give customers what they wanted.

Still, if I were a Universal shareholder I wouldn’t let Morris off the hook. What kind of manager, in an industry facing historic disruption, is uninterested in learning about the source of that disruption? A CEO can’t be an expert on everything. But can’t the guy learn just a little bit about technology?