November 29, 2024

BitTrickle

Mike Godwin notes a serious error in the recent NYT future-of-TV article:

A more important problem with the article is that it gives a false impression of the normal user experience of BitTorrent:

Created by Bram Cohen, a 29-year-old programmer in Bellevue, Wash., BitTorrent breaks files hundreds or thousands of times bigger than a song file into small pieces to speed its path to the Internet and then to your computer. On the kind of peer-to-peer site that gave the music industry night sweats, an episode of “Desperate Housewives” that some fan copied and posted on the Internet can take hours to download; on BitTorrent, it arrives in minutes.

That hasn’t been my experience of BitTorrent, and I doubt many other ordinary users routinely experience the downloading of TV programs in “minutes.” On the off chance that BitTorrent speeds had suddenly improved since I had last used the application, I conducted an experiment – I downloaded the latest episode of Showtime’s program “Huff,” which stars Hank Azaria, within 24 hours of its having aired. (Downloading a program shortly after it has aired, when interest in the episode is at its peak, is the way to maximize download speed on BitTorrent.) The result? Even with the premium broadband service I have at my office, downloading Episode 13 of “Huff” – the final episode of the season – took six hours, with download speeds rarely exceeding 30KB/sec.

The NYT article seems to make a common error in thinking about BitTorrent. BitTorrent’s main effect is not to make downloads faster as the number of users increases, but to keep downloads from getting much slower. A simple model can explain why this is so. (As with all good models, this one gets the important things right but ignores some details.)

Let’s assume that a file is being offered by a server, and the server’s net connection is fast enough to transmit the entire file in S seconds. We’ll assume that N users want to download the file simultaneously, and that each user has a net connection that would take U seconds to transmit the entire file. (Generally, the user is willing to pay less for Net service than the server, so S < U.)

In an old-fashioned (pre-BitTorrent) system, all N copies of the file must go through the server's connection. That takes SN seconds. One copy goes through each user's connection, which takes U seconds. The two steps, taking times SN and U, can happen simultaneously, so the time to do both is equal to the larger of SN and U.

T_old(N) = max(SN, U)

If the file is popular (i.e., N is large), the SN term will dominate and the download time will be proportional to N. For popular files, adding users makes downloads slower.

In BitTorrent, the file only needs to go through the server’s connection once, which takes S seconds. On average, each block of the file must traverse a typical user’s link twice, since each block must be downloaded once, and BitTorrent expects each user to upload as many blocks as it downloads. So with BitTorrent, the total time to serve the N users is max(S, 2U). Recalling that S < U, we can see that

T_bt(N) = 2U

Now we can see the big win offered by BitTorrent: the download time is independent of the number of users (N), and of the speed of the server’s connection (S). Adding more users doesn’t make the download faster, but it doesn’t make it slower either. (It’s also worth noting that if N, the number of users, is small, BitTorrent is worse than old-fashioned systems, by a factor of two.)
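
For readers who like to poke at models, here is a minimal Python sketch of the two formulas above. The function names and the sample values of S and U are my own illustrative assumptions, not measurements; the only requirement is S < U, as in the model.

```python
def t_old(n: int, s: float, u: float) -> float:
    """Old-fashioned model: all n copies cross the server's link (s*n seconds)
    while one copy crosses each user's link (u seconds); the two overlap."""
    return max(s * n, u)


def t_bt(n: int, s: float, u: float) -> float:
    """BitTorrent model: one copy crosses the server's link (s seconds) and
    each block crosses a typical user's link twice (2*u seconds)."""
    return max(s, 2 * u)


if __name__ == "__main__":
    # Illustrative values only (assumed), chosen so that s < u.
    s, u = 60.0, 600.0
    for n in (1, 5, 20, 100, 1000):
        print(f"N={n:5d}  old={t_old(n, s, u):8.0f}s  bt={t_bt(n, s, u):8.0f}s")
```

With these assumed values the two models tie at N = 2U/S = 20. Below that point the old-fashioned system is faster, and above it the old system's time keeps growing with N while BitTorrent's stays at 2U.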

With BitTorrent, the bottleneck is the end user’s net connection, only half of which can be used for BitTorrent downloads. (The other half must be used for uploads.) Most users’ connections, even the broadband ones, will take an awfully long time to download high-quality video content, BitTorrent or not.

Groundhog Day

Yesterday was Groundhog Day, the holiday. But for SunnComm, the embattled CD-DRM vendor, it may have been Groundhog Day, the movie, in which Bill Murray’s character is doomed to repeat the same unpleasant events until he learns certain lessons.

Yesterday SunnComm announced a new product. According to a Register story, the product fixes SunnComm’s infamous Shift Key problem. One has to wonder where the Reg got this idea, given that SunnComm’s press release is written oh so carefully to avoid saying that they have actually fixed the Shift Key problem.

The Shift Key problem, discovered by my student Alex Halderman, allows any computer user to defeat SunnComm’s previous anti-copying technology by holding down the computer’s Shift key while inserting the CD. (True story: When Alex first told me this, it took me a while to verify it because SunnComm’s technology had no effect at all on the first few computers I tried, even without use of the Shift Key.) In reality, the Shift Key behavior is not a “problem” but a security feature of Windows which keeps software on a CD from installing itself without the user’s permission.

Early CD-DRM technologies used passive measures, meaning that they encoded the music on the CD with deliberate errors. The goal was to find a kind of error that would be corrected (or not noticed) by ordinary CD players, but would cause computers’ CD drives to fail. The result would be that people could play the CD on an ordinary player but couldn’t rip it (or play it, for that matter) on a computer. This plan never quite worked, for two reasons. First, it relied on bugs in computer drives. Those bugs didn’t exist in some computer systems, and where they did exist they tended to be fixed. Second, some CD players are built from the same components as computer CD drives, so some encoded CDs were unplayable on some ordinary CD players.

Later CD-DRM technologies, like MediaMax CD3, the SunnComm system that suffered from the Shift Key problem, relied on active measures. The CD would contain software that would (try to) install itself the first time the user put the CD into the computer. This software would then actively interfere with attempts to rip the music from the CD. (The software would also provide some limited access to the music on the CD.) The problem with active measures is that they don’t work if the software never gets installed on the user’s computer, and there is no realistic way to force the user to install the software. The Shift Key trick was just one way for the user to prevent unauthorized software installation.

SunnComm’s new press release says that they are now adding passive measures (i.e., deliberate data encoding errors) to their MediaMax technology. They claim that, despite these deliberate errors, the CDs will be “100% playable in all consumer CD and DVD players”. This is very hard to believe. Mostly compatible, sure. 98% compatible, maybe. But 100% compatibility requires the CD to be playable on those CD or DVD players that are built with computer-drive components. How they could do that, while maintaining the required incompatibility with those same components in a computer, is a mystery.

Beyond this, the new passive measures, like the old ones, must rely on computer bugs that won’t exist on some systems, and will tend to be fixed on others. On many computers, then, the new passive measures will have no effect at all, leaving only the old active measures, which will fall to the Shift Key trick. Now we can see why SunnComm’s release stops short of claiming a Shift Key fix, and of claiming to prevent P2P infringement. We can see, too, why SunnComm’s investors and customers will be disappointed, yet again, when the product is released and its limitations become obvious.

And then, like the Bill Murray character, SunnComm will be doomed to relive the cycle yet again.

More Trouble for Network Monitors

A while back I wrote about a method (well known to cryptography nerds) to frustrate network monitoring. It works by breaking up a file into several shares, in such a way that any individual share, and indeed any proper subset of the shares, looks entirely random, but if you have all of the shares then you can add them together to get back the original file. Today I want to develop this idea a bit further.
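
As a refresher, the splitting step can be sketched in a few lines of Python. This is one way to do it, assuming that "adding" means byte-wise addition mod 256; the function names `split` and `combine` are my own, and XOR would work just as well.

```python
import secrets


def split(data: bytes, n: int) -> list[bytes]:
    """Split data into n shares whose byte-wise sum mod 256 equals data."""
    if n < 2:
        raise ValueError("need at least two shares")
    # The first n-1 shares are uniformly random bytes.
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # The last share is whatever makes the byte-wise sum come out to data.
    last = bytes(
        (d - sum(column)) % 256
        for d, column in zip(data, zip(*shares))
    )
    return shares + [last]


def combine(shares: list[bytes]) -> bytes:
    """Add all of the shares together, byte by byte, mod 256."""
    return bytes(sum(column) % 256 for column in zip(*shares))
```

Calling `combine(split(data, 4))` gives back `data`. Leave out any one share and what remains is just a pile of uniformly random bytes.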

The trick I discussed before sent the shares one at a time, perhaps interspersed with other traffic, so that a network monitor would have to gather and store all of the shares, and know that they were supposed to go together, in order to know what was really being transmitted. The network monitor couldn’t just look at one message at a time. In other words, the shares were transmitted from the same place, but at different times.

It turns out that you can also transmit the shares from different places. The idea is to divide a file into shares, and put one share on server A, another on server B, another on server C, and so on. Somebody who wanted the file (and who knew how it was distributed) would go to all of the servers and get one share from each, and then combine them. To figure out what was going on, a network monitor would have to be monitoring the traffic from all of the servers, and it would have to know how to put the seemingly random shares together. The network monitor would have to gather information from many places and bring it together. That’s difficult, especially if there are many servers involved.

If the network monitor did figure out what was going on, then it would know which servers were participating in the scheme. If Alice and Bob were both publishing shares of the file, then the network monitor would blame them both.

Congratulations on making it this far. Now here’s the cool part. Suppose that Alice is already publishing some file A that looks random. Now Bob wants to publish a file F using two-way splitting; so Bob publishes B = F-A, so that people can add A and B to get back F. Now suppose the network monitor notices that the two random-looking files A and B add up to F; so the network monitor tries to blame Alice and Bob. But Alice says no – she was just publishing the random-looking file A, and Bob came along later and published F-A. Alice is off the hook.

But note that Bob can use the same excuse. He can claim that he published the random-looking file B, and Alice came along later and published F-B. To the monitor, A and B look equally random. So the monitor can’t tell who is telling the truth. Both Alice and Bob have plausible deniability – Alice because she is actually innocent, and Bob because the network monitor can’t distinguish his situation from Alice’s. (Of course, this also works with more than two people. There could be many innocent Alices and one tricky Bob.)
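
Here is a small Python sketch of why the two stories are symmetric. It assumes the same byte-wise arithmetic mod 256 for "add" and "subtract", and the helper names (`add`, `sub`, `bob_share`) are hypothetical. Bob computes B = F-A, but the resulting pair is exactly the pair you would get if Alice had computed A = F-B.

```python
import secrets


def add(x: bytes, y: bytes) -> bytes:
    """Byte-wise sum mod 256 of two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return bytes((a + b) % 256 for a, b in zip(x, y))


def sub(x: bytes, y: bytes) -> bytes:
    """Byte-wise difference mod 256 of two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return bytes((a - b) % 256 for a, b in zip(x, y))


def bob_share(f: bytes, a: bytes) -> bytes:
    """Bob's published file: B = F - A, given Alice's existing file A."""
    return sub(f, a)


if __name__ == "__main__":
    f = b"the file Bob wants to publish"
    a = secrets.token_bytes(len(f))  # Alice's random-looking file (assumed)
    b = bob_share(f, a)
    print(add(a, b) == f)  # anyone can recover F from A and B
    print(sub(f, b) == a)  # A is also exactly "F - B"
```

Nothing in the bytes of A or B records which one was computed from the other, which is why the monitor has to fall back on outside evidence such as publication times.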

Bob faces some difficulties in pulling off this trick. For example, the network monitor might notice that Alice published her file before Bob published his. Bob doesn’t have a foolproof scheme for distributing files anonymously – at least not yet. But stay tuned for my next post about this topic.

Show Us the Numbers

Today brings yet another story about how Hollywood’s finances are better than ever. Ross Johnson’s story (“Video Sales Abroad Are Good News in Hollywood. Shhh.”) in today’s New York Times tells us that the studios are keeping their overseas DVD sales secret, so as not to interfere with the industry’s tradition of lowballing its revenue.

“For a long time, the film business was a single-digit business on investment return,” said Charles Roven, the producer of “Batman Begins” from Warner Brothers, a division of Time Warner. “Now, because of home video, it’s a low double-digit business, and the studios want to make sure it doesn’t go back into the single-digit business.”

In the past, lowballing has enabled the industry to limit its payouts to stars whose contracts call for a share of the profits. As the story reports, that battle goes on.

These days, of course, surging profits would be inconvenient in another way. They would undercut the industry’s rent-seeking in Washington, which relies on a narrative in which technology destroys the industry’s revenue stream. If the technology problem is really as bad as the industry says, then it ought to show up in the sales numbers.

The music industry has opened its books, reporting sales and revenue numbers that fell for several years before rebounding slightly in 2004. By all reports, the movie industry is still more profitable than ever.

It may turn out that the net effect of technology on the industry is neutral, or even positive. If so, then no expansion of copyright law is needed, and a mild contraction may even be in order. Remember, the goal of copyright is not to maximize the profits of any one industry, but to foster creativity by regulating just enough to ensure an adequate incentive to create. If the industry wants to argue that incentives are inadequate now, or will be in the future, then it will have to show us the numbers.

The stars fight lowballing by demanding a detailed audit of industry revenue reports. We should demand no less.

Review of MPAA's "Parent File Scan" Software

Yesterday the MPAA announced the availability of a new software tool called Parent File Scan. I decided to download it and try it out. Here’s my review.

According to an MPAA site,

Parent File Scan software helps consumers check whether their computers have peer-to-peer software and potentially infringing copies of motion pictures and other copyrighted material. Removing such material can help consumers avoid problems frequently caused by peer-to-peer software. The information generated by the software is made available only to the program’s user, and is not shared with or reported to the MPAA or another body.

In practice, if there are music files on a computer, no software tool can tell whether they’re legal or illegal, because there is no way to tell whether the files came from ripping the consumer’s own CDs (which is legal) or from infringing P2P downloading (which is illegal). Saying the music files on consumer computers are “potentially infringing” will probably cause some people to delete files that are perfectly legal. The implication that removing music files from your computer “can help [you] avoid problems frequently caused by peer-to-peer software” seems misleading. Of course, it’s totally correct that removing P2P apps will eliminate any problems caused by P2P apps.

The Parent File Scan software itself comes from a company called DtecNet. You download and install the software, click through a standard-looking EULA, and you’re ready to go. When you tell it to scan, it searches your hard drive for files in common audio or video formats, and for P2P apps. On my machine, it seemed to find all of the audio files (all legal). It found no video files, which I think is correct. The only P2P app on my machine was an old version of Napster (which was never used to infringe). Parent File Scan failed to find Napster, but it’s worth noting that the old Napster version in question is now utterly useless.

At the end of the scan, if you have any P2P apps, Parent File Scan offers to remove them. Based on the documentation, it appears that the removal is done by invoking the P2P app’s own removal program; the documentation warns that there might not be a removal program, and it might not remove everything that came with the P2P app (i.e., spyware).

Parent File Scan also lists the audio and video files it found. It discloses very clearly (annoyingly often, in fact) that it has no way of knowing whether the files are legal or illegal. Here’s a typical message:

The program does not distinguish between legal and illegal copies. It is up to the user to determine whether the files found by the program have been acquired legally, or if the material should be deleted.

In the post-scan display, each audio/video file has a checkbox which you can check to designate the file for deletion. The default is to delete nothing. I deleted a few old files that I didn’t want anymore, and everything seemed to work correctly.

All in all, the program seems to do its job well. The user interface is clear and straightforward, and does not try to scare or mislead the user. Not everybody will want a program like this, but those who do will probably be happy with Parent File Scan.

UPDATED (11:15 PM): Added the word “infringing” before “P2P” in the “In practice …” paragraph, to eliminate the (false) implication that all P2P downloading is illegal.