April 21, 2014


GAO Data: Porn Rare on P2P; Filters Ineffective

P2P nets have fewer pornographic images than the Web, and P2P porn filters are ineffective, according to data in a new report from the U.S. Government Accountability Office (GAO).

Mind you, the report’s summary text says pretty much the opposite, but where I come from, data gets more credibility than spin. The data can be found on pages 58-69 of the report. (My PDF reader calls those pages 61-72. To add to the confusion, the pages include images of PowerPoint slides bearing the numbers 53-64.)

The researchers did searches for images, using six search terms (three known to be associated with porn and three innocuous ones) on three P2P systems (Warez, Kazaa, Morpheus) and three search engines (Google, MSN, Yahoo). They looked at the resulting images and classified each image as adult porn, child porn, cartoon porn, adult erotica, cartoon erotica, or other. For brevity, I’ll lump together all of the porn and erotica categories into a meta-category that I’ll call “porne”, so that there are two categories, porne and non-porne.

The first observation from the data is that P2P nets have relatively few porne images, compared to the Web. The eighteen P2P searches found a total of 277 porne images. The eighteen Web searches found at least 655 porne images. But they had to cut off the analysis after the first 100 images of each Web search, because the Web searches returned so many images, so the actual number of Web porne images might have been much larger. (No such truncation was necessary on the P2P searches.)

The obvious conclusion is that if you want to regulate communications technology to keep porne away from kids, you should start with the Web, because it’s a much bigger danger than P2P.

The report also looked at the effectiveness of the porn blocking facilities built into some of the products. The data show pretty clearly that the filters are ineffective at distinguishing porne from non-porne images.

Two of the P2P systems, Kazaa and Morpheus, have built-in porn blocking. The report did the same searches, with and without blocking enabled, and compared the results. They report the data in an odd format, but I have reorganized their data into a more enlightening form. First, let’s look at the results for the three search terms “known to be associated with pornography”. For each term, I’ll report two figures of merit: what percentage of the porne images was blocked by the filter, and what percentage of the non-porne images was (erroneously) blocked by the filter. Here are the results:

Product     % Porne Blocked    % Non-porne Blocked
Kazaa       100%               100%
Morpheus    83%                69%

Kazaa blocks all of the porne, by the clever expedient of blocking absolutely everything it sees. For non-porne images, Kazaa has a 100% error rate. Morpheus does only slightly better, blocking 83% of the porne, while erroneously blocking “only” 69% of the non-porne. In all, it’s a pretty poor performance.

Here are the results for searches on innocuous search terms (ignoring one term which never yielded any porne):

Product     % Porne Blocked    % Non-porne Blocked
Kazaa       100%               -9%
Morpheus    -150%              0%

You may be wondering where the negative percentages come from. According to the report, more images are found with the filters turned on than when they are turned off. If the raw data are to be believed, turning on the Morpheus filter more than doubles the amount of porne you can find! There’s obviously something wrong with the data, and it appears to be that searches were done at different times, when very different sets of files were available. This is pretty sloppy experimental technique – enough to cast doubt on the whole report. (One expects better from the GAO.)
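To make the arithmetic behind these figures concrete, here is a minimal sketch of how a "percent blocked" figure falls out of two search runs, one with the filter off and one with it on. The function name and the counts are hypothetical, picked only to show how the calculation behaves; they are not the report's raw numbers.

```python
def percent_blocked(found_without_filter, found_with_filter):
    """Percentage of images the filter appears to remove, given two search runs.

    The result is negative when the filtered run returns MORE images than the
    unfiltered run, which can only happen if the two runs saw different files.
    """
    if found_without_filter == 0:
        raise ValueError("nothing found without the filter; rate is undefined")
    removed = found_without_filter - found_with_filter
    return 100.0 * removed / found_without_filter


# Hypothetical counts, for illustration only.
print(round(percent_blocked(12, 2)))  # filter run finds fewer images
print(round(percent_blocked(2, 5)))   # filter run finds more images: negative
```

A negative result is not a property of the filter. It is a sign that the two runs were not measuring the same pool of files.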

But we can salvage some value from this experiment if we assume that even though the total number of files on the P2P net changed from one measurement to the next, the fraction of files that were porne stayed about the same. (If this is not true, then we can’t really trust any of the experiments in the report.) Making this assumption, we can then calculate the percentage of available files that are porne, both with and without blocking.

Product     % Porne, without Filter    % Porne, with Filter
Kazaa       27%                        0%
Morpheus    20%                        38%

The Kazaa filter successfully blocks all of the porne, but we don’t know how much of the non-porne it erroneously blocks. The Morpheus filter does a terrible job, actually making things worse. You could do better by just flipping a coin to decide whether to block each image.
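The coin-flip comparison can be checked with a little arithmetic. The sketch below assumes a hypothetical pool of 20 porne and 80 non-porne files (20% porne, chosen to mirror the unfiltered Morpheus row) and made-up block rates. None of these counts or rates come from the report.

```python
def porne_fraction(porne, non_porne):
    """Fraction of the visible files that are porne."""
    total = porne + non_porne
    if total == 0:
        raise ValueError("no files visible; fraction is undefined")
    return porne / total


def apply_filter(porne, non_porne, porne_block_rate, non_porne_block_rate):
    """Expected counts left visible after blocking each category at a given rate."""
    return (porne * (1 - porne_block_rate),
            non_porne * (1 - non_porne_block_rate))


# Hypothetical pool: 20 porne files and 80 non-porne files.
coin_flip = apply_filter(20, 80, 0.5, 0.5)   # blocks half of everything
skewed = apply_filter(20, 80, 0.1, 0.5)      # blocks mostly innocuous files

print(porne_fraction(*coin_flip))  # same share of porne as before filtering
print(porne_fraction(*skewed))     # larger share of porne than before
```

A coin flip blocks both categories at the same rate, so on average it leaves the porne share unchanged. A filter can raise that share only by blocking innocuous files more readily than porne ones.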

So here’s the bottom line on P2P porne filters: you can have a filter that massively overblocks innocuous images, or you can have a filter that sometimes makes things worse and can’t reliably beat a coin flip. Or you can face the fact that these filters don’t help.

(The report also looked at the effectiveness of the built-in porn filters in Web search engines, but due to methodological problems those experiments don’t tell us much.)

The policy prescription here is clear. Don’t mandate the use of filters, because they don’t seem to work. And if you want filters to improve, it might be a good idea to fully legalize research on filtering systems, so people like Seth Finkelstein can finish the job the GAO started.


BitTorrent: The Next Main Event

Few tears will be shed if Grokster and StreamCast are driven out of business as a result of the Supreme Court’s decision. The companies are far from lovable, and their technology is yesterday’s news anyway.

A much more important issue is what the rules will be for the next generation of technologies. Here the Court did not offer the clarity we might have hoped for, opting instead for what Tim Wu has described as the Miss Manners rule, under which vendors must avoid showing an unseemly interest in infringing uses of their products. This would appear to protect vendors who are honestly uninterested in fostering infringement, as well as those who are very interested but manage to hide it.

Lower courts will be left to apply the Grokster Court’s inducement rule to the facts of other file distribution technologies. How far will lower courts go? Will they go too far?

The litmus test is BitTorrent. Here is a technology that is widely used for both infringing and non-infringing purposes, with infringement probably predominating today. And yet: It was originally created to support noninfringing sharing (of concert recordings, with permission). Its creator, Bram Cohen, seems interested only in noninfringing uses, and has said all the right things about infringement – so consistently that one can only conclude he is sincere. BitTorrent is nicely engineered, offering novel benefits to infringing and noninfringing users alike. It is available for free, so there is no infringement-based business model. In short, BitTorrent looks like a clear example of the kind of dual-use technology that ought to pass the Court’s active inducement test.

A court that followed the Grokster analysis closely would have to let BitTorrent off the hook. To do otherwise, I think, would be to institute a de facto predominant-use test, finding BitTorrent liable because too many of its users infringed. This might be dressed up as an inducement analysis, but it would be clear to everybody what was going on. Given the squishiness of the Grokster analysis, we can’t rule this out.

So the stage is set for the next phase of the copyright/technology litigation war. The music and movie industries don’t want to live in a world where BitTorrent is allowed to exist. The Supreme Court didn’t give them enough yesterday to kill BitTorrent. So the industries’ goal will be to stretch the Grokster rule, just as they tried to stretch the Sony rule before hitting a sandbar in the Grokster district court. We’ll see a careful campaign of litigation against peer-to-peer services, trying to gradually stretch the noose of inducement liability until it fits around BitTorrent’s neck. Failing that, we’ll see a push to get Congress to codify (the industries’ interpretation of) the Grokster rule.

The real winners, as usual, are the copyright lawyers.


Patry: The Court Punts

William Patry (a distinguished copyright lawyer) offers an interesting take on Grokster. He says that the court was unable to come to agreement on how to apply the Sony Betamax precedent to Grokster, and so punted the issue.


Legality of Design Decisions, and Footnote 12 in Grokster

As a technologist I find the most interesting, and scariest, part of the Grokster opinion to be the discussion of product design decisions. The Court seems to say that Sony bars liability based solely on product design (p. 16):

Sony barred secondary liability based on presuming or imputing intent to cause infringement solely from the design or distribution of a product capable of substantial lawful use, which the distributor knows is in fact used for infringement.

And again (on p. 17),

Sony’s rule limits imputing culpable intent as a matter of law from the characteristics or uses of a distributed product.

But when it comes time to lay out the evidence of intent to foster infringement, we get this (p. 22):

Second, this evidence of unlawful objective is given added significance by MGM’s showing that neither company attempted to develop filtering tools or other mechanisms to diminish the infringing activity using their software. While the Ninth Circuit treated the defendants’ failure to develop such tools as irrelevant because they lacked an independent duty to monitor their users’ activity, we think this evidence underscores Grokster’s and StreamCast’s intentional facilitation of their users’ infringement.

It’s hard to square this with the previous statements that intent is not to be inferred from the characteristics of the product. Perhaps the answer is in footnote 12, which the court hangs off the last word in the previous quote:

Of course, in the absence of other evidence of intent, a court would be unable to find contributory infringement liability merely based on a failure to take affirmative steps to prevent infringement, if the device otherwise was capable of substantial noninfringing uses. Such a holding would tread too close to the Sony safe harbor.

So it seems that product design decisions are not to be questioned, unless there is some other evidence of bad intent to open the door.

To make things worse, the Court here criticizes Grokster and StreamCast for making a very reasonable engineering decision. There is every reason to believe that filtering technology would add to the cost and complexity of the companies’ software, without substantially reducing infringement. (We discussed this issue in the computer science professors’ brief.) In short, the Court here engages in exactly the kind of design second-guessing that technologists fear.

Legitimate technologists will still worry that a well-funded plaintiff can cook up a stew of product design second-guessing, business model second-guessing, and occasional failures of copyright compliance by low-level employees, into an active inducement case. This risk existed before, and the Court today hasn’t done much to reduce it.


Business Model as Evidence of Intent

One interesting aspect of Justice Souter’s majority opinion in Grokster is the criticism of the business models of StreamCast and Grokster (pp. 22-23):

Third, there is a further complement to the direct evidence of unlawful objective. It is useful to recall that StreamCast and Grokster make money by selling advertising space, by directing ads to the screens of computers employing their software. As the record shows, the more the software is used, the more ads are sent out and the greater the advertising revenue becomes. Since the extent of the software’s use determines the gain to the distributors, the commercial sense of their enterprise turns on high-volume use, which the record shows is infringing. This evidence alone would not justify an inference of unlawful intent, but viewed in the context of the entire record its import is clear.

It’s hard to think of any conceivable business model for a software company under which an increase in use of the product does not lead to an increase in revenue. If you sell software, greater use allows you to increase the price, or to sell more units. Likewise if you sell software by subscription. If you give away the software and make money on auxiliary products or services, you’ll still benefit from increased usage.

Certainly Sony’s profits would have increased the more people used Betamaxes. The same is true for iPods, TiVos, photocopiers, and many other legitimate products. Profiting from use seems like pretty poor evidence of intent to cause infringement.


Grokster Loses

The Supreme Court ruled unanimously against Grokster, finding the company’s actions to be illegal. (Reported by SCOTUSblog.) Expect an explosion of discussion in the blogosphere. My usual one-post-a-day limit will be suspended today.

Unanimous opinion of the Court (written by Souter)
Concurrence of Ginsburg (joined by Rehnquist and Kennedy)
Concurrence of Breyer (joined by Stevens and O’Connor)

I’ll be participating in a special Grokster discussion over at SCOTUSblog, along with several distinguished lawyers. Everything I post here will be duplicated there, and vice versa.

Also, Randy Picker is organizing a lawprof “mobblawg” about today’s Grokster and BrandX rulings, with an impressive group of participants.


Book Club Discussion: Code, Chapters 3 and 4

This week in Book Club we read Chapters 3 and 4 of Lawrence Lessig’s Code, and Other Laws of Cyberspace.

Now it’s time to discuss the chapters. I’m especially eager to see discussion of this week’s chapters, and not just general reflections on the book as a whole.

You can chime in by entering a comment below.

For next week, we’ll read Chapter 5.


Content Filtering and Security

Buggy security software can make you less secure. Indeed, a growing number of intruders are exploiting bugs in security software to gain access to systems. Smart system administrators have known for a long time to be careful about deploying new “security” products.

A company called Audible Magic is trying to sell “content filtering” systems to universities and companies. The company’s CopySense product is a computer that sits at the boundary between an organization’s internal network and the Internet. CopySense watches the network traffic going by, and tries to detect P2P transfers that involve infringing content, in order to log them or block them. It’s not clear how accurate the system’s classifiers are, as Audible Magic does not allow independent evaluation. The company claims that CopySense improves security, by blocking dangerous P2P traffic.

It seems just as likely that CopySense makes enterprise networks less secure. CopySense boxes run general-purpose operating systems, so they are prone to security bugs that could allow an outsider to seize control of them. And a compromised CopySense system would be very bad news, an ideal listening post for the intruder, positioned to watch all incoming and outgoing network traffic.

How vulnerable is CopySense? We have no way of knowing, since Audible Magic doesn’t allow independent evaluation of the product. You have to sign an NDA to get access to a CopySense box.

This in itself should be cause for suspicion. Hard experience shows that companies that are secretive about the design of their security technology tend to have weaker systems than companies that are more open. If I were an enterprise network administrator, I wouldn’t trust a secret design like CopySense.

Audible Magic could remedy this problem and show confidence in their design by lifting their restrictive NDA requirements, allowing independent evaluation of their product and open discussion of its level of security. They could do this tomorrow. Until they do, their product should be considered risky.


Regulation by Software

The always interesting James Grimmelmann has a new paper, Regulation by Software (.pdf), on how software relates to law. He starts by dissecting Lessig’s “code is law” argument. Lessig argues that code is a form of “architecture” – part of the environment in which we live. And we know that the shape of our living environment regulates behavior, in the sense that we would behave differently if our environment were different.

Orin Kerr at Volokh wrote about Grimmelmann’s paper, leading to a vigorous discussion. Commenters, including Dan Simon, argued that if all designed objects regulate, then the observation that software regulates in the same way isn’t very useful. If toothpicks regulate, and squeaky tennis shoes regulate, what makes software so special?

Which brings us to the point of Grimmelmann’s paper. He argues that software is very different from ordinary physical objects, so that software-based regulation is not the same animal as object-based regulation. It’s best, he says, to think of software as a different medium of regulation.

Software-based regulation has four characteristics, according to Grimmelmann. It is extremely formal and rule-bound. It can impose rules without disclosing what the rules are. Its rules are always applied and cannot be ignored by mutual agreement. It is fragile since software tends to be insecure and buggy.

Regulation by software will work best, Grimmelmann argues, where these four characteristics are consistent with the regulator’s goals. He looks at two case studies, and finds that software is ill-suited for controlling access to copyrighted works, but software does work well for managing online marketplaces. Both findings are consistent with reality.

This is a useful contribution to the discussion, and it couldn’t have come at a better time for Freedom to Tinker book club members.


Another reason for reforming the DMCA

I’ll be signing off my guest-blog stint at Freedom to Tinker now. (Thanks for your hospitality, Prof. Felten.)

Before I go, I wanted to point you to a chapter excerpt from “Darknet” I just posted here. It tells the story of how the vice president of Intel Corp. violated the Digital Millennium Copyright Act (DMCA) without realizing it, by making a home movie of his son playing Pop Warner football and incorporating snippets of a Hollywood DVD.

As the VP, Donald S. Whiteside, told a Congressional delegation:

“This is precisely the kind of exciting consumer creativity that should be enabled. I don’t claim to have all the answers. Should I have to go clear rights to use ten seconds from Rudy in my son’s video, or does it fall under fair use? Should I have to pay pennies for every second of a snippet? I don’t know. But I do know that we have to figure out a way for consumers to do something creative without breaking the law.

“To me, this episode was a great way to frame the question: Should copyright law permit this or not? Should the DMCA criminalize this sort of thing? Or should the creative community, high-tech community, and lawmakers get together to try to stimulate this kind of innovative behavior?”

Well put.

— J.D. Lasica