December 22, 2024

Archives for June 2005

GAO Data: Porn Rare on P2P; Filters Ineffective

P2P nets have fewer pornographic images than the Web, and P2P porn filters are ineffective, according to data in a new report from the U.S. Government Accountability Office (GAO).

Mind you, the report’s summary text says pretty much the opposite, but where I come from, data gets more credibility than spin. The data can be found on pages 58-69 of the report. (My PDF reader calls those pages 61-72. To add to the confusion, the pages include images of PowerPoint slides bearing the numbers 53-64.)

The researchers did searches for images, using six search terms (three known to be associated with porn and three innocuous ones) on three P2P systems (Warez, Kazaa, Morpheus) and three search engines (Google, MSN, Yahoo). They looked at the resulting images and classified each image as adult porn, child porn, cartoon porn, adult erotica, cartoon erotica, or other. For brevity, I’ll lump together all of the porn and erotica categories into a meta-category that I’ll call “porne”, so that there are two categories, porne and non-porne.

The first observation from the data is that P2P nets have relatively few porne images, compared to the Web. The eighteen P2P searches found a total of 277 porne images. The eighteen Web searches found at least 655 porne images. But they had to cut off the analysis after the first 100 images of each Web search, because the Web searches returned so many images, so the actual number of Web porne images might have been much larger. (No such truncation was necessary on the P2P searches.)

The obvious conclusion is that if you want to regulate communications technology to keep porne away from kids, you should start with the Web, because it’s a much bigger danger than P2P.

The report also looked at the effectiveness of the porn blocking facilities built into some of the products. The data show pretty clearly that the filters are ineffective at distinguishing porne from non-porne images.

Two of the P2P systems, Kazaa and Morpheus, have built-in porn blocking. The report did the same searches, with and without blocking enabled, and compared the results. They report the data in an odd format, but I have reorganized their data into a more enlightening form. First, let’s look at the results for the three search terms “known to be associated with pornography”. For each term, I’ll report two figures of merit: what percentage of the porne images was blocked by the filter, and what percentage of the non-porne images was (erroneously) blocked by the filter. Here are the results:

Product | % Porne Blocked | % Non-porne Blocked
Kazaa | 100% | 100%
Morpheus | 83% | 69%

Kazaa blocks all of the porne, by the clever expedient of blocking absolutely everything it sees. For non-porne images, Kazaa has a 100% error rate. Morpheus does only slightly better, blocking 83% of the porne, while erroneously blocking “only” 69% of the non-porne. In all, it’s a pretty poor performance.

Here are the results for searches on innocuous search terms (ignoring one term which never yielded any porne):

Product | % Porne Blocked | % Non-porne Blocked
Kazaa | 100% | -9%
Morpheus | -150% | 0%

You may be wondering where the negative percentages come from. According to the report, more images are found with the filters turned on than when they are turned off. If the raw data are to be believed, turning on the Morpheus filter more than doubles the amount of porne you can find! There’s obviously something wrong with the data, and it appears to be that searches were done at different times, when very different sets of files were available. This is pretty sloppy experimental technique – enough to cast doubt on the whole report. (One expects better from the GAO.)
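To make the arithmetic concrete, here is a minimal sketch of how a "percent blocked" figure can be computed from two counts, and how it goes negative when the filtered search returns more images than the unfiltered one. The function name and the counts are my own illustrations, not figures from the GAO report.

```python
def percent_blocked(count_unfiltered, count_filtered):
    """Percentage of images removed by the filter, relative to the
    number found with the filter off. Negative if the filtered search
    actually returned more images."""
    if count_unfiltered == 0:
        raise ValueError("no images found without the filter; rate is undefined")
    return 100.0 * (count_unfiltered - count_filtered) / count_unfiltered


# Hypothetical counts, chosen only to illustrate the formula (not GAO data).
print(percent_blocked(10, 0))  # filter removes everything
print(percent_blocked(2, 5))   # more images with the filter on: negative rate
```

A rate of -150% simply means the filtered search turned up two and a half times as many images as the unfiltered one, which is only possible if the two searches ran against different sets of files.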

But we can salvage some value from this experiment if we assume that even though the total number of files on the P2P net changed from one measurement to the next, the fraction of files that were porne stayed about the same. (If this is not true, then we can’t really trust any of the experiments in the report.) Making this assumption, we can then calculate the percentage of available files that are porne, both with and without blocking.

Product | % Porne, without Filter | % Porne, with Filter
Kazaa | 27% | 0%
Morpheus | 20% | 38%

The Kazaa filter successfully blocks all of the porne, but we don’t know how much of the non-porne it erroneously blocks. The Morpheus filter does a terrible job, actually making things worse. You could do better by just flipping a coin to decide whether to block each image.
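The calculation behind that last table is just a ratio, and it can be sketched as follows. The function name and the counts are hypothetical; I picked the counts so the output has the same shape as the Morpheus row, but they are not the report's raw numbers.

```python
def percent_porne(porne_count, non_porne_count):
    """Share of all returned images that are porne, as a percentage."""
    total = porne_count + non_porne_count
    if total == 0:
        raise ValueError("search returned no images; share is undefined")
    return 100.0 * porne_count / total


# Hypothetical counts for one product (not GAO data).
without_filter = percent_porne(porne_count=2, non_porne_count=8)
with_filter = percent_porne(porne_count=5, non_porne_count=8)

print(round(without_filter), round(with_filter))
```

Because this measure depends only on the mix of images within each search, it stays meaningful even when the total number of available files differs between the filtered and unfiltered runs, provided the assumption above holds.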

So here’s the bottom line on P2P porne filters: you can have a filter that massively overblocks innocuous images, or you can have a filter that sometimes makes things worse and can’t reliably beat a coin flip. Or you can face the fact that these filters don’t help.

(The report also looked at the effectiveness of the built-in porn filters in Web search engines, but due to methodological problems those experiments don’t tell us much.)

The policy prescription here is clear. Don’t mandate the use of filters, because they don’t seem to work. And if you want filters to improve, it might be a good idea to fully legalize research on filtering systems, so people like Seth Finkelstein can finish the job the GAO started.

BitTorrent: The Next Main Event

Few tears will be shed if Grokster and StreamCast are driven out of business as a result of the Supreme Court’s decision. The companies are far from lovable, and their technology is yesterday’s news anyway.

A much more important issue is what the rules will be for the next generation of technologies. Here the Court did not offer the clarity we might have hoped for, opting instead for what Tim Wu has described as the Miss Manners rule, under which vendors must avoid showing an unseemly interest in infringing uses of their products. This would appear to protect vendors who are honestly uninterested in fostering infringement, as well as those who are very interested but manage to hide it.

Lower courts will be left to apply the Grokster Court’s inducement rule to the facts of other file distribution technologies. How far will lower courts go? Will they go too far?

The litmus test is BitTorrent. Here is a technology that is widely used for both infringing and non-infringing purposes, with infringement probably predominating today. And yet: It was originally created to support noninfringing sharing (of concert recordings, with permission). Its creator, Bram Cohen, seems interested only in noninfringing uses, and has said all the right things about infringement – so consistently that one can only conclude he is sincere. BitTorrent is nicely engineered, offering novel benefits to infringing and noninfringing users alike. It is available for free, so there is no infringement-based business model. In short, BitTorrent looks like a clear example of the kind of dual-use technology that ought to pass the Court’s active inducement test.

A court that followed the Grokster analysis closely would have to let BitTorrent off the hook. To do otherwise, I think, would be to institute a de facto predominant-use test, finding BitTorrent liable because too many of its users infringed. This might be dressed up as an inducement analysis, but it would be clear to everybody what was going on. Given the squishiness of the Grokster analysis, we can’t rule this out.

So the stage is set for the next phase of the copyright/technology litigation war. The music and movie industries don’t want to live in a world where BitTorrent is allowed to exist. The Supreme Court didn’t give them enough yesterday to kill BitTorrent. So the industries’ goal will be to stretch the Grokster rule, just as they tried to stretch the Sony rule before hitting a sandbar in the Grokster district court. We’ll see a careful campaign of litigation against peer-to-peer services, trying to gradually stretch the noose of inducement liability until it fits around BitTorrent’s neck. Failing that, we’ll see a push to get Congress to codify (the industries’ interpretation of) the Grokster rule.

The real winners, as usual, are the copyright lawyers.

Patry: The Court Punts

William Patry (a distinguished copyright lawyer) offers an interesting take on Grokster. He says that the court was unable to come to agreement on how to apply the Sony Betamax precedent to Grokster, and so punted the issue.

Legality of Design Decisions, and Footnote 12 in Grokster

As a technologist I find the most interesting, and scariest, part of the Grokster opinion to be the discussion of product design decisions. The Court seems to say that Sony bars liability based solely on product design (p. 16):

Sony barred secondary liability based on presuming or imputing intent to cause infringement solely from the design or distribution of a product capable of substantial lawful use, which the distributor knows is in fact used for infringement.

And again (on p. 17),

Sony’s rule limits imputing culpable intent as a matter of law from the characteristics or uses of a distributed product.

But when it comes time to lay out the evidence of intent to foster infringement, we get this (p. 22):

Second, this evidence of unlawful objective is given added significance by MGM’s showing that neither company attempted to develop filtering tools or other mechanisms to diminish the infringing activity using their software. While the Ninth Circuit treated the defendants’ failure to develop such tools as irrelevant because they lacked an independent duty to monitor their users’ activity, we think this evidence underscores Grokster’s and StreamCast’s intentional facilitation of their users’ infringement.

It’s hard to square this with the previous statements that intent is not to be inferred from the characteristics of the product. Perhaps the answer is in footnote 12, which the Court hangs off the last word in the previous quote:

Of course, in the absence of other evidence of intent, a court would be unable to find contributory infringement liability merely based on a failure to take affirmative steps to prevent infringement, if the device otherwise was capable of substantial noninfringing uses. Such a holding would tread too close to the Sony safe harbor.

So it seems that product design decisions are not to be questioned, unless there is some other evidence of bad intent to open the door.

To make things worse, the Court here criticizes Grokster and StreamCast for making a very reasonable engineering decision. There is every reason to believe that filtering technology would add to the cost and complexity of the companies’ software, without substantially reducing infringement. (We discussed this issue in the computer science professors’ brief.) In short, the Court here engages in exactly the kind of design second-guessing that technologists fear.

Legitimate technologists will still worry that a well-funded plaintiff can cook up a stew of product design second-guessing, business model second-guessing, and occasional failures of copyright compliance by low-level employees, into an active inducement case. This risk existed before, and the Court today hasn’t done much to reduce it.

Business Model as Evidence of Intent

One interesting aspect of Justice Souter’s majority opinion in Grokster is the criticism of the business models of StreamCast and Grokster (pp. 22-23):

Third, there is a further complement to the direct evidence of unlawful objective. It is useful to recall that StreamCast and Grokster make money by selling advertising space, by directing ads to the screens of computers employing their software. As the record shows, the more the software is used, the more ads are sent out and the greater the advertising revenue becomes. Since the extent of the software’s use determines the gain to the distributors, the commercial sense of their enterprise turns on high-volume use, which the record shows is infringing. This evidence alone would not justify an inference of unlawful intent, but viewed in the context of the entire record its import is clear.

It’s hard to think of any conceivable business model for a software company under which an increase in use of the product does not lead to an increase in revenue. If you sell software, greater use allows you to increase the price, or to sell more units. Likewise if you sell software by subscription. If you give away the software and make money on auxiliary products or services, you’ll still benefit from increased usage.

Certainly Sony’s profits would have increased the more people used Betamaxes. The same is true for iPods, TiVos, photocopiers, and many other legitimate products. Profiting from use seems like pretty poor evidence of intent to cause infringement.