October 18, 2017

Archives for June 2005

GAO Data: Porn Rare on P2P; Filters Ineffective

P2P nets have fewer pornographic images than the Web, and P2P porn filters are ineffective, according to data in a new report from the U.S. Government Accountability Office (GAO).

Mind you, the report’s summary text says pretty much the opposite, but where I come from, data gets more credibility than spin. The data can be found on pages 58-69 of the report. (My PDF reader calls those pages 61-72. To add to the confusion, the pages include images of PowerPoint slides bearing the numbers 53-64.)

The researchers did searches for images, using six search terms (three known to be associated with porn and three innocuous ones) on three P2P systems (Warez, Kazaa, Morpheus) and three search engines (Google, MSN, Yahoo). They looked at the resulting images and classified each image as adult porn, child porn, cartoon porn, adult erotica, cartoon erotica, or other. For brevity, I’ll lump together all of the porn and erotica categories into a meta-category that I’ll call “porne”, so that there are two categories, porne and non-porne.

The first observation from the data is that P2P nets have relatively few porne images, compared to the Web. The eighteen P2P searches found a total of 277 porne images. The eighteen Web searches found at least 655 porne images. The researchers cut off the analysis after the first 100 images of each Web search, because the Web searches returned so many images, so the actual number of Web porne images might have been much larger. (No such truncation was necessary on the P2P searches.)

The obvious conclusion is that if you want to regulate communications technology to keep porne away from kids, you should start with the Web, because it’s a much bigger danger than P2P.

The report also looked at the effectiveness of the porn blocking facilities built into some of the products. The data show pretty clearly that the filters are ineffective at distinguishing porne from non-porne images.

Two of the P2P systems, Kazaa and Morpheus, have built-in porn blocking. The researchers ran the same searches, with and without blocking enabled, and compared the results. The report presents the data in an odd format, but I have reorganized it into a more enlightening form. First, let’s look at the results for the three search terms “known to be associated with pornography”. For each term, I’ll report two figures of merit: what percentage of the porne images was blocked by the filter, and what percentage of the non-porne images was (erroneously) blocked by the filter. Here are the results:

Product  | % Porne Blocked | % Non-porne Blocked
Kazaa    | 100%            | 100%
Morpheus | 83%             | 69%

Kazaa blocks all of the porne, by the clever expedient of blocking absolutely everything it sees. For non-porne images, Kazaa has a 100% error rate. Morpheus does only slightly better, blocking 83% of the porne, while erroneously blocking “only” 69% of the non-porne. In all, it’s a pretty poor performance.

Here are the results for searches on innocuous search terms (ignoring one term which never yielded any porne):

Product  | % Porne Blocked | % Non-porne Blocked
Kazaa    | 100%            | -9%
Morpheus | -150%           | 0%

You may be wondering where the negative percentages come from. According to the report, more images are found with the filters turned on than when they are turned off. If the raw data are to be believed, turning on the Morpheus filter more than doubles the amount of porne you can find! There’s obviously something wrong with the data, and it appears to be that searches were done at different times, when very different sets of files were available. This is pretty sloppy experimental technique – enough to cast doubt on the whole report. (One expects better from the GAO.)
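To make the arithmetic behind a negative percentage concrete, here is a minimal sketch of how a "percent blocked" figure can be computed from two counts. The function name and the counts are hypothetical, chosen only for illustration; they are not the GAO's raw data, and the report may compute its figures differently.

```python
def percent_blocked(count_without_filter: int, count_with_filter: int) -> float:
    """Percentage of images removed by the filter, relative to the unfiltered count.

    A negative result means more images appeared with the filter on
    than with it off.
    """
    if count_without_filter <= 0:
        raise ValueError("need at least one unfiltered image to compute a percentage")
    removed = count_without_filter - count_with_filter
    return 100.0 * removed / count_without_filter


# Hypothetical counts, made up for illustration only.
print(percent_blocked(10, 0))  # 100.0: everything was blocked
print(percent_blocked(2, 5))   # -150.0: more images found with the filter on
```

Under this definition, any case where the filtered search returns more images than the unfiltered one produces a negative figure, which is what you would expect if the two searches ran against different sets of available files.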

But we can salvage some value from this experiment if we assume that even though the total number of files on the P2P net changed from one measurement to the next, the fraction of files that were porne stayed about the same. (If this is not true, then we can’t really trust any of the experiments in the report.) Making this assumption, we can then calculate the percentage of available files that are porne, both with and without blocking.

Product  | % Porne, without Filter | % Porne, with Filter
Kazaa    | 27%                     | 0%
Morpheus | 20%                     | 38%

The Kazaa filter successfully blocks all of the porne, but we don’t know how much of the non-porne it erroneously blocks. The Morpheus filter does a terrible job, actually making things worse. You could do better by just flipping a coin to decide whether to block each image.
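The coin-flip comparison can be checked with a little arithmetic. The sketch below is illustrative only: the function names and the starting counts (20 porne, 80 non-porne) are assumptions picked to give a 20% starting share, not figures from the report. It shows that a filter blocking porne and non-porne at the same rate leaves the porne share unchanged in expectation, so a filter that raises the share is doing worse than chance.

```python
from typing import Tuple


def porne_share(porne: float, non_porne: float) -> float:
    """Percentage of the available images that are porne."""
    total = porne + non_porne
    if total <= 0:
        raise ValueError("no images to classify")
    return 100.0 * porne / total


def apply_blocking(porne: float, non_porne: float,
                   porne_block_rate: float,
                   non_porne_block_rate: float) -> Tuple[float, float]:
    """Expected counts left over after blocking each category at the given rate."""
    return (porne * (1.0 - porne_block_rate),
            non_porne * (1.0 - non_porne_block_rate))


# Hypothetical starting point: 20 porne and 80 non-porne images.
print(porne_share(20, 80))  # 20.0

# A coin flip blocks half of each category, so the share stays the same.
p, n = apply_blocking(20, 80, 0.5, 0.5)
print(porne_share(p, n))  # 20.0

# A filter that misses all porne but blocks half the non-porne raises the share.
p, n = apply_blocking(20, 80, 0.0, 0.5)
print(round(porne_share(p, n), 1))  # 33.3
```

The last case is the kind of behavior that would make things worse: the filter removes innocuous images while leaving the porne in place, so the porne share goes up.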

So here’s the bottom line on P2P porne filters: you can have a filter that massively overblocks innocuous images, or you can have a filter that sometimes makes things worse and can’t reliably beat a coin flip. Or you can face the fact that these filters don’t help.

(The report also looked at the effectiveness of the built-in porn filters in Web search engines, but due to methodological problems those experiments don’t tell us much.)

The policy prescription here is clear. Don’t mandate the use of filters, because they don’t seem to work. And if you want filters to improve, it might be a good idea to fully legalize research on filtering systems, so people like Seth Finkelstein can finish the job the GAO started.

BitTorrent: The Next Main Event

Few tears will be shed if Grokster and StreamCast are driven out of business as a result of the Supreme Court’s decision. The companies are far from lovable, and their technology is yesterday’s news anyway.

A much more important issue is what the rules will be for the next generation of technologies. Here the Court did not offer the clarity we might have hoped for, opting instead for what Tim Wu has described as the Miss Manners rule, under which vendors must avoid showing an unseemly interest in infringing uses of their products. This would appear to protect vendors who are honestly uninterested in fostering infringement, as well as those who are very interested but manage to hide it.

Lower courts will be left to apply the Grokster Court’s inducement rule to the facts of other file distribution technologies. How far will lower courts go? Will they go too far?

The litmus test is BitTorrent. Here is a technology that is widely used for both infringing and non-infringing purposes, with infringement probably predominating today. And yet: It was originally created to support noninfringing sharing (of concert recordings, with permission). Its creator, Bram Cohen, seems interested only in noninfringing uses, and has said all the right things about infringement – so consistently that one can only conclude he is sincere. BitTorrent is nicely engineered, offering novel benefits to infringing and noninfringing users alike. It is available for free, so there is no infringement-based business model. In short, BitTorrent looks like a clear example of the kind of dual-use technology that ought to pass the Court’s active inducement test.

A court that followed the Grokster analysis closely would have to let BitTorrent off the hook. To do otherwise, I think, would be to institute a de facto predominant-use test, finding BitTorrent liable because too many of its users infringed. This might be dressed up as an inducement analysis, but it would be clear to everybody what was going on. Given the squishiness of the Grokster analysis, we can’t rule this out.

So the stage is set for the next phase of the copyright/technology litigation war. The music and movie industries don’t want to live in a world where BitTorrent is allowed to exist. The Supreme Court didn’t give them enough yesterday to kill BitTorrent. So the industries’ goal will be to stretch the Grokster rule, just as they tried to stretch the Sony rule before hitting a sandbar in the Grokster district court. We’ll see a careful campaign of litigation against peer-to-peer services, trying to gradually stretch the noose of inducement liability until it fits around BitTorrent’s neck. Failing that, we’ll see a push to get Congress to codify (the industries’ interpretation of) the Grokster rule.

The real winners, as usual, are the copyright lawyers.

Patry: The Court Punts

William Patry (a distinguished copyright lawyer) offers an interesting take on Grokster. He says that the Court was unable to come to agreement on how to apply the Sony Betamax precedent to Grokster, and so punted the issue.