January 19, 2025

Copyright Education: Harder Than It Looks

This afternoon I’m going to lead a discussion among twenty-five bright Princeton students, about the basics of copyright. Why do we have copyright? Why does it cover expression and not ideas? Why fair use? The answers are subtle, but I hope to guide the discussion toward finding them.

I can say for sure that a flat “downloading = shoplifting” argument would be torn to shreds in minutes. This equation seems wrong to most people, and it is wrong. Copyrights differ from traditional property in important ways. That doesn’t mean that copyright isn’t justified, but it does mean that the justification for copyright doesn’t follow from the justification for ordinary property. It will take a room full of college students a while to sort through all of this.

Let’s face it, this is challenging material, even for smart, motivated twenty-year-olds.

Meanwhile, JD Lasica notes that in fourth-grade classrooms, the BSA’s anticopying ferret (who seems, amusingly, to have been copied himself) will try to explain the same concepts to nine-year-olds. Cory Doctorow observes that this is crazy. Telling nine-year-olds that they have to understand copyright before they can use the Internet is like telling them that they have to understand employment taxes before they can run a lemonade stand.

I pity the fourth-grade teacher who, having read the BSA’s Teacher’s Guide, has to explain exactly what it is that is being stolen when a kid copies an image from the Barbie website to use as a placemat at dinner. If I were that teacher, I would prefer simpler questions like “Why are people mean to each other?” and “How did the universe start?”

Splitting the Grokster Baby

David Post at the Volokh Conspiracy makes an astute prediction about the outcome of the Grokster case. He expects the Supreme Court to try to split the baby by overturning the lower court decision (which Hollywood is asking for) while upholding the Sony Betamax doctrine immunizing designers of dual-use technologies from secondary liability (which technologists are asking for). How will the Court do this? Here’s Post:

The Court has an easy “out” here, and my experience has been that when they’re presented with an easy out they usually grab it. The Ninth Circuit in this case affirmed the grant of summary judgment to Grokster, holding that on any reasonable version of the facts, Grokster could not be held liable for “contributory copyright infringement” because the software involved is “capable of substantial non-infringing uses” under the Sony v. [Universal] case. The record company plaintiffs want the Court to “tighten up” the Sony standard, and to say, in effect, that the non-infringing uses that these P2P networks have are not “substantial” enough under Sony.

That would be a disaster for technology providers — but I don’t think that’s what the Court will say. Instead, I think the Court will send the case back to the Ninth Circuit and say: you were right that, under Sony, the non-infringing uses here are substantial enough so that, standing alone, providers of these p2p technologies can’t be held liable for the copyright infringements of network users. But — and here’s the critical part — on these facts, it doesn’t stand alone; there’s evidence in this record that Grokster and the other defendants actively encouraged and induced its customers to infringe copyrights, and that inducement of this kind is not protected by the Sony safe harbor. The Court will then instruct the Ninth Circuit to re-open the case and evaluate whether or not this evidence is enough to hold the defendants liable on an inducement, or “aiding and abetting,” theory of liability.

In doing this, the Court would be drawing a line between acts of technology design, which would not trigger secondary liability as long as the technology is capable of substantial noninfringing use, and other acts, which could trigger secondary liability. If the Court doesn’t draw this line carefully, we could be left with a terrible muddle.

Consider, for instance, a vendor’s decision not to try to incorporate filtering technologies into its product. This is a decision about the design of the product, but the Hollywood briefs argue that it is also (or instead) a decision about which market to enter, i.e. a non-design decision. Ideally, the Court would make clear that this is a design decision and therefore protected under Sony. But if the Court leaves this issue unaddressed or, worse yet, simply hints at moral disapproval of Grokster’s lack of filtering, technologists may be left in the dark as to which kinds of design decisions are really covered by Sony.

In my predictions for 2005, I predicted that the Court’s ruling would not provide clarity for future technologists. A vague split-the-baby decision is one way that could happen.

[To be safe, I’ll follow Post and belabor the obvious: a prediction is an assertion that something will happen; it doesn’t imply that the predicted event is or isn’t desirable.

I’m being a bit cagey about my own views here, partly because I’m going to be leading class discussions about Grokster soon, and some of my students are probably reading this. Sometimes students take positions that they think will please the professor, on the expectation that they’ll get higher grades just because they agree with the professor. I do my best to reward students for making creative and well-reasoned arguments, regardless of whether I agree with them. If anything, I try to lean the other way, and reward students for disagreeing with me, if they do it well.]

BitTrickle

Mike Godwin notes a serious error in the recent NYT future-of-TV article:

A more important problem with the article is that it gives a false impression of the normal user experience of BitTorrent:

Created by Bram Cohen, a 29-year-old programmer in Bellevue, Wash., BitTorrent breaks files hundreds or thousands of times bigger than a song file into small pieces to speed its path to the Internet and then to your computer. On the kind of peer-to-peer site that gave the music industry night sweats, an episode of “Desperate Housewives” that some fan copied and posted on the Internet can take hours to download; on BitTorrent, it arrives in minutes.

That hasn’t been my experience of BitTorrent, and I doubt many other ordinary users routinely experience the downloading of TV programs in “minutes.” On the off chance that BitTorrent speeds had suddenly improved since I had last used the application, I conducted an experiment – I downloaded the latest episode of Showtime’s program “Huff,” which stars Hank Azaria, within 24 hours of its having aired. (Downloading a program shortly after it has aired, when interest in the episode is at its peak, is the way to maximize download speed on BitTorrent.) The result? Even with the premium broadband service I have at my office, downloading Episode 13 of “Huff” – the final episode of the season – took six hours, with download speeds rarely exceeding 30KB/sec.

The NYT article seems to make a common error in thinking about BitTorrent. BitTorrent’s main effect is not to make downloads faster as the number of users increases, but to keep downloads from getting much slower. A simple model can explain why this is so. (As with all good models, this one gets the important things right but ignores some details.)

Let’s assume that a file is being offered by a server, and the server’s net connection is fast enough to transmit the entire file in S seconds. We’ll assume that N users want to download the file simultaneously, and that each user has a net connection that would take U seconds to transmit the entire file. (Generally, a user is willing to pay less for Net service than a server operator is, so S < U.)

In an old-fashioned (pre-BitTorrent) system, all N copies of the file must go through the server’s connection. That takes SN seconds. One copy goes through each user’s connection, which takes U seconds. The two steps, taking times SN and U, can happen simultaneously, so the time to do both is equal to the larger of SN and U.

T_old(N) = max(SN, U)

If the file is popular (i.e., N is large), the SN term will dominate and the download time will be proportional to N. For popular files, adding users makes downloads slower.

In BitTorrent, the file only needs to go through the server’s connection once, which takes S seconds. On average, each block of the file must traverse a typical user’s link twice, since each block must be downloaded once, and BitTorrent expects each user to upload as many blocks as it downloads. So with BitTorrent, the total time to serve the N users is max(S, 2U). Recalling that S < U, we can see that

T_bt(N) = 2U

Now we can see the big win offered by BitTorrent: the download time is independent of the number of users (N), and of the speed of the server’s connection (S). Adding more users doesn’t make the download faster, but it doesn’t make it slower either. (It’s also worth noting that if N, the number of users, is small, BitTorrent is worse than old-fashioned systems, by a factor of two.)
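To make the comparison concrete, here is a minimal Python sketch of the two formulas from the model. The function names and the sample values of S and U are my own, chosen only for illustration; they are not measurements of any real network.

```python
def t_old(n: int, s: float, u: float) -> float:
    """Download time in the old-fashioned model: max(S*N, U).

    n: number of simultaneous downloaders (N)
    s: seconds for the server's link to send one copy (S)
    u: seconds for a user's link to carry one copy (U)
    """
    return max(s * n, u)


def t_bt(n: int, s: float, u: float) -> float:
    """Download time in the simple BitTorrent model: max(S, 2U).

    n is accepted only to mirror t_old; the result does not depend on it.
    """
    return max(s, 2 * u)


if __name__ == "__main__":
    S, U = 10.0, 100.0  # hypothetical values, chosen so that S < U
    for n in (1, 10, 100, 1000):
        print(f"N={n:5d}  old={t_old(n, S, U):8.0f}s  bt={t_bt(n, S, U):8.0f}s")
```

In this sketch the old-fashioned time grows with N once SN exceeds U, while the BitTorrent time stays at 2U. The old-fashioned system is the faster one only while SN is less than 2U.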

With BitTorrent, the bottleneck is the end user’s net connection, only half of which can be used for BitTorrent downloads. (The other half must be used for uploads.) Most users’ connections, even the broadband ones, will take an awfully long time to download high-quality video content, BitTorrent or not.

Groundhog Day

Yesterday was Groundhog Day, the holiday. But for SunnComm, the embattled CD-DRM vendor, it may have been Groundhog Day, the movie, in which Bill Murray’s character is doomed to repeat the same unpleasant events until he learns certain lessons.

Yesterday SunnComm announced a new product. According to a Register story, the product fixes SunnComm’s infamous Shift Key problem. One has to wonder where the Reg got this idea, given that SunnComm’s press release is written oh so carefully to avoid saying that they have actually fixed the Shift Key problem.

The Shift Key problem, discovered by my student Alex Halderman, allows any computer user to defeat SunnComm’s previous anti-copying technology by holding down the computer’s Shift key while inserting the CD. (True story: When Alex first told me this, it took me a while to verify it because SunnComm’s technology had no effect at all on the first few computers I tried, even without use of the Shift Key.) In reality, the Shift Key behavior is not a “problem” but a security feature of Windows which keeps software on a CD from installing itself without the user’s permission.

Early CD-DRM technologies used passive measures, meaning that they encoded the music on the CD with deliberate errors. The goal was to find a kind of error that would be corrected (or not noticed) by ordinary CD players, but would cause computers’ CD drives to fail. The result would be that people could play the CD on an ordinary player but couldn’t rip it (or play it, for that matter) on a computer. This plan never quite worked, for two reasons. First, it relied on bugs in computer drives. Those bugs didn’t exist in some computer systems, and where they did exist they tended to be fixed. Second, some CD players are built from the same components as computer CD drives, so some encoded CDs were unplayable on some ordinary CD players.

Later CD-DRM technologies, like MediaMax CD3, the SunnComm system that suffered from the Shift Key problem, relied on active measures. The CD would contain software that would (try to) install itself the first time the user put the CD into the computer. This software would then actively interfere with attempts to rip the music from the CD. (The software would also provide some limited access to the music on the CD.) The problem with active measures is that they don’t work if the software never gets installed on the user’s computer, and there is no realistic way to force the user to install the software. The Shift Key trick was just one way for the user to prevent unauthorized software installation.

SunnComm’s new press release says that they are now adding passive measures (i.e., deliberate data encoding errors) to their MediaMax technology. They claim that, despite these deliberate errors, the CDs will be “100% playable in all consumer CD and DVD players”. This is very hard to believe. Mostly compatible, sure. 98% compatible, maybe. But 100% compatibility requires the CD to be playable on those CD or DVD players that are built with computer-drive components. How they could do that, while maintaining the required incompatibility with those same components in a computer, is a mystery.

Beyond this, the new passive measures, like the old ones, must rely on computer bugs that won’t exist on some systems, and will tend to be fixed on others. On many computers, then, the new passive measures will have no effect at all, leaving only the old active measures, which will fall to the Shift Key trick. Now we can see why SunnComm’s release stops short of claiming a Shift Key fix, and of claiming to prevent P2P infringement. We can see, too, why SunnComm’s investors and customers will be disappointed, yet again, when the product is released and its limitations become obvious.

And then, like the Bill Murray character, SunnComm will be doomed to relive the cycle yet again.

More Trouble for Network Monitors

A while back I wrote about a method (well known to cryptography nerds) to frustrate network monitoring. It works by breaking up a file into several shares, in such a way that any individual share, and indeed any incomplete subset of the shares, looks entirely random, but if you have all of the shares then you can add them together to get back the original file. Today I want to develop this idea a bit further.
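For readers who want to see the splitting trick in code, here is a minimal Python sketch. It assumes that "adding" means bytewise addition modulo 256 (XOR would work equally well); the function names split and combine are mine, not part of any standard tool.

```python
import secrets


def split(data: bytes, k: int) -> list[bytes]:
    """Split data into k shares that sum (bytewise, mod 256) to data."""
    if k < 2:
        raise ValueError("need at least two shares")
    # The first k-1 shares are uniformly random bytes.
    shares = [secrets.token_bytes(len(data)) for _ in range(k - 1)]
    # The last share is whatever makes each byte column sum to the data byte.
    last = bytes((b - sum(col)) % 256 for b, col in zip(data, zip(*shares)))
    return shares + [last]


def combine(shares: list[bytes]) -> bytes:
    """Add all shares together bytewise, mod 256, to recover the data."""
    return bytes(sum(col) % 256 for col in zip(*shares))


if __name__ == "__main__":
    original = b"any file contents"
    pieces = split(original, 3)
    assert combine(pieces) == original
    print("recovered:", combine(pieces))
```

Because the first k-1 shares are chosen at random and the last one is the data minus their sum, any incomplete set of shares is uniformly random and tells an observer nothing about the file beyond its length.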

The trick I discussed before sent the shares one at a time, perhaps interspersed with other traffic, so that a network monitor would have to gather and store all of the shares, and know that they were supposed to go together, in order to know what was really being transmitted. The network monitor couldn’t just look at one message at a time. In other words, the shares were transmitted from the same place, but at different times.

It turns out that you can also transmit the shares from different places. The idea is to divide a file into shares, and put one share on server A, another on server B, another on server C, and so on. Somebody who wanted the file (and who knew how it was distributed) would go to all of the servers and get one share from each, and then combine them. To figure out what was going on, a network monitor would have to be monitoring the traffic from all of the servers, and it would have to know how to put the seemingly random shares together. The network monitor would have to gather information from many places and bring it together. That’s difficult, especially if there are many servers involved.

If the network monitor did figure out what was going on, then it would know which servers were participating in the scheme. If Alice and Bob were both publishing shares of the file, then the network monitor would blame them both.

Congratulations on making it this far. Now here’s the cool part. Suppose that Alice is already publishing some file A that looks random. Now Bob wants to publish a file F using two-way splitting; so Bob publishes B = F-A, so that people can add A and B to get back F. Now suppose the network monitor notices that the two random-looking files A and B add up to F; so the network monitor tries to blame Alice and Bob. But Alice says no – she was just publishing the random-looking file A, and Bob came along later and published F-A. Alice is off the hook.

But note that Bob can use the same excuse. He can claim that he published the random-looking file B, and Alice came along later and published F-B. To the monitor, A and B look equally random. So the monitor can’t tell who is telling the truth. Both Alice and Bob have plausible deniability – Alice because she is actually innocent, and Bob because the network monitor can’t distinguish his situation from Alice’s. (Of course, this also works with more than two people. There could be many innocent Alices and one tricky Bob.)
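The symmetry between Alice’s story and Bob’s can be checked in a few lines of Python. This is only a sketch under the same assumption as before, that "adding" files means bytewise addition modulo 256; the helper names and the sample contents of F are hypothetical.

```python
import secrets


def add_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise sum of two equal-length byte strings, mod 256."""
    return bytes((a + b) % 256 for a, b in zip(x, y))


def sub_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise difference x - y of two equal-length byte strings, mod 256."""
    return bytes((a - b) % 256 for a, b in zip(x, y))


if __name__ == "__main__":
    F = b"the file Bob wants to publish"  # hypothetical file contents
    A = secrets.token_bytes(len(F))       # Alice's random-looking file
    B = sub_bytes(F, A)                   # what Bob actually publishes: F - A

    assert add_bytes(A, B) == F           # anyone can add A and B to get F
    assert sub_bytes(F, B) == A           # Bob's story: "Alice published F - B"
    print("both stories are consistent with the published files")
```

The second assertion is the heart of the matter: F - B is exactly A, so the published files fit Bob’s account just as well as Alice’s.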

Bob faces some difficulties in pulling off this trick. For example, the network monitor might notice that Alice published her file before Bob published his. Bob doesn’t have a foolproof scheme for distributing files anonymously – at least not yet. But stay tuned for my next post about this topic.