November 21, 2024

Major Intrusion at MediaDefender

MediaDefender, a company providing technical countermeasures and intelligence gathering for copyright owners, suffered a severe cyber-intrusion over the past year or so. This was revealed last week when the intruders released what appears to be most of MediaDefender’s email from this calendar year, along with the source code for its products, and even one of the company’s VoIP phone calls.

Published analyses of the released material mostly confirm what was already suspected, that MediaDefender’s technical tactics had mixed effectiveness, and that the company may have edged across the ethical (and possibly legal) line by launching active cyber-attacks on suspected infringers.

The intruders, on the other hand, went far across the line, committing serious crimes. If caught, they’ll face severe punishment, and rightly so. No excuse can justify this kind of break-in.

Nor have the intruders struck a blow for online freedom. Instead, they have helped their opponents paint a (misleading) picture in which righteous copyright owners are under attack by a small cabal of scofflaw super-hackers.

Expect a backlash. And the main victims of that backlash, as usual, will be ordinary users who aren’t out to hurt anybody but just want some way to coexist peacefully with copyright owners.

[Correction (Sept. 25): Corrected the first paragraph, which previously said voice mail had been captured, to say that a VoIP phone call was captured.]

Why Was Skype Offline?

Last week Skype, the popular, free Net telephony service, was unavailable for a day or two due to technical problems. Failures of big systems are always interesting and this is no exception.

We have only limited information about what went wrong. Skype said very little at first but is now opening up a little. Based on their description, it appears that the self-organization mechanism in Skype’s peer-to-peer network became unstable. Let’s unpack that to understand what it means, and what it can tell us about systems like this.

One of the surprising facts about big information systems is that the sheer scale of a system changes the engineering problems you face. When a system grows from small to large, the existing problems naturally get harder. But you also see entirely new problems that didn’t even exist at small scale – and, worse yet, this will happen again and again as your system keeps growing.

Skype uses a peer-to-peer organization, in which the traffic flows through ordinary users’ computers rather than being routed through a set of central servers managed by Skype itself. The advantage of exploiting users’ computers is that they’re available at no cost and, conveniently, there are more of them to exploit when there are more users requesting service. The disadvantage is that users’ computers tend to reboot or go offline more than dedicated servers would.

To deal with the ever-changing population of user computers, Skype has to use a clever self-organization algorithm that allows the machines to organize themselves without relying (more than a tiny bit) on a central authority. Self-organization has two goals: (1) the system must respond quickly to changed conditions to get back into a good configuration soon, and (2) the system must maintain stability as conditions change. These two goals aren’t entirely contradictory, but they are at least in tension. Responding quickly to changes makes it difficult to maintain stability, and the system must be engineered to make this tradeoff wisely in a wide range of conditions. Getting this right in a huge P2P system like Skype is tricky.

Which brings us to the story of last week’s failure, as described by Skype. On Tuesday August 14, Microsoft released a new set of patches to Windows, according to their normal monthly cycle. Many Windows machines downloaded the patch, installed it, and then rebooted. Each such machine would leave the Skype network when it shut down, then rejoin after booting. So the effect of Microsoft’s patch release was to increase the turnover in Skype’s network.

The result, Skype says, is that the network became unstable as the respond-quickly mechanism outran the maintain-stability mechanism; and the problem snowballed as the growing instability caused ever stronger (but poorly aimed) responses. The Skype service was essentially unavailable for a day or two starting on Thursday August 16, until the company could track down the problem and fix a code bug that it said contributed to the problem.

The biggest remaining mystery is why the problem took so long to develop. Microsoft issued the patch on Tuesday, and Skype didn’t get into deep trouble until Thursday. We can explain away some of the delay by noting that Windows machines might take up to a day to download the patch and reboot, but this still means it took Skype’s network at least a day to melt down. I’d love to know more about how this happened.

I would hesitate to draw too many broad conclusions from a single failure like this. Large systems of all kinds, whether centralized or P2P, must fight difficult stability problems. When a problem like this does occur, it’s a useful natural experiment in how large systems behave. I only hope Skype has more to say about what went wrong.

Inside Clouseau's Brain: Dissecting SafeMedia's Outlandish Technical Claims

I wrote in April about the over-the-top marketing claims of the “anti-piracy” company SafeMedia. (See Is SafeMedia a Parody?) The company’s marketing materials claim that its comically named product, “Clouseau,” can do what is provably impossible. Having both a professional and personal interest in how such claims come to be made, I wanted to learn more about how Clouseau actually worked. But the company, unsurprisingly, did not provide that information.

Now we have two more clues. First, SafeMedia founder Safwat Fahmy was actually invited to testify before a congressional hearing, where he provided written testimony. Second, I got hold of a white paper that SafeMedia salespeople are giving to prospective customers. Both documents give some technical information about Clouseau.

[CORRECTION (June 26): Mr. Fahmy was not actually invited to testify, and he did not appear before the committee, according to the committee’s own web site about the hearing. All he did was submit written testimony, which absolutely anyone is allowed to do. I was misled by a SafeMedia press release. I should have known better than to rely on those guys.]

The documents contradict each other in several ways. For example, Mr. Fahmy’s testimony says that Clouseau “detects and prohibits illegal P2P traffic while allowing the passage of legal P2P such as BitTorrent” (page 5). But the white paper says that BitTorrent is illegal and was blocked every time by Clouseau in their tests (page 6 and Appendix A).

Similarly, the white paper says, “In a series of tests conducted by us, Clouseau did not block any normal packets including web HTTP(S) and VPN (ipSec and PPTP).” (page 5) (HTTPS and VPN protocols are standard ways of using encryption to hide the content of communications.) But Mr. Fahmy’s congressional testimony says that “Clouseau is fully effective at forensically discriminating between legal and illegal P2P traffic with no false positives … whether encrypted or not” (page 7) which implies that it must block some HTTPS and VPN traffic.

One thing the documents seem to agree on is that Clouseau operates by trying to detect and block certain protocols, rather than looking at the material being transmitted. That is, it doesn’t look for infringing content but instead declares certain protocols to be illegitimate and then tries to block them. Which is a problematic design because many protocols are used for both infringing and noninfringing purposes. Some protocols, like BitTorrent see lots of noninfringing use and lots of infringing use. So Clouseau will get many cases wrong, whether it blocks BitTorrent or not – a problem the company apparently gets around by claiming to block BitTorrent and claiming not to block it.

How does the company square its protocol-blocking design with its claim to block illegal content with complete accuracy? Apparently they just redefine the term “illegal” to be co-extensive with the set of things their product blocks. In other words, the company’s legal claims seem to be just as implausible as its technical claims.

[UPDATE (Oct. 5, 2007): I hear rumors that SafeMedia is telling people that they offered me or my group access to a Clouseau device to study, but we refused. For the record, this is false.]

DRM Wars: The Next Generation

Last week at the Usenix Security Symposium, I gave an invited talk, with the same title as this post. The gist of the talk was that the debate about DRM (copy protection) technologies, which has been stalemated for years now, will soon enter a new phase. I’ll spend this post, and one or two more, explaining this.

Public policy about DRM offers a spectrum of choices. On one end of the spectrum are policies that bolster DRM, by requiring or subsidizing it, or by giving legal advantages to companies that use it. On the other end of the spectrum are policies that hinder DRM, by banning or regulating it. In the middle is the hands-off policy, where the law doesn’t mention DRM, companies are free to develop DRM if they want, and other companies and individuals are free to work around the DRM for lawful purposes. In the U.S. and most other developed countries, the move has been toward DRM-bolstering laws, such as the U.S. DMCA.

The usual argument in favor of bolstering DRM is that DRM retards peer-to-peer copyright infringement. This argument has always been bunk – every worthwhile song, movie, and TV show is available via P2P, and there is no convincing practical or theoretical evidence that DRM can stop P2P infringement. Policymakers have either believed naively that the next generation of DRM would be different, or accepted vague talk about speedbumps and keeping honest people honest.

At last, this is starting to change. Policymakers, and music and movie companies, are starting to realize that DRM won’t solve their P2P infringement problems. And so the usual argument for DRM-bolstering laws is losing its force.

You might expect the response to be a move away from DRM-bolstering laws. Instead, advocates of DRM-bolstering laws have switched to two new arguments. First, they argue that DRM enables price discrimination – business models that charge different customers different prices for a product – and that price discrimination benefits society, at least sometimes. Second, they argue that DRM helps platform developers lock in their customers, as Apple has done with its iPod/iTunes products, and that lock-in increases the incentive to develop platforms. I won’t address the merits or limitations of these arguments here – I’m just observing that they’re replacing the P2P piracy bogeyman in the rhetoric of DMCA boosters.

Interestingly, these new arguments have little or nothing to do with copyright. The maker of almost any product would like to price discriminate, or to lock customers in to its product. Accordingly, we can expect the debate over DRM policy to come unmoored from copyright, with people on both sides making arguments unrelated to copyright and its goals. The implications of this change are pretty interesting. They’ll be the topic of my next post.

Conscientious Objection in P2P

One argument made against using P2P systems like Grokster was that by using them you might participate in the distribution of bad content such as infringing files, hate speech, or child porn. If you use the Web to distribute or read content, you play no part in distributing anything you find objectionable – you only distribute a file if you choose to do so. P2P, the argument goes, is different.

Today I want to consider what you can do if you want to use P2P to access files, but you want to avoid participating in any way in the distribution of bad files. When I say a file is “bad” I mean only that you, personally, have a strong moral objection to it, so that you do not want to participate in its distribution. Different people will have different ideas about which files (if any) are bad. Saying that a file is bad is not the same as saying that it should be banned or that others should not be allowed to distribute it – choosing not to do something yourself is not the same as banning others from doing it. So this is not about censorship.

The original design of BitTorrent was friendly to those who wanted to avoid distributing bad files. You could distribute any files you liked, and by default you would automatically redistribute any file that you had downloaded. But you wouldn’t find yourself distributing any bad files (unless you downloaded bad files yourself), or even helping anybody find bad files. Others could read or publish what they wanted, but you wouldn’t help them unless you wanted to.

This is unlike Grokster or Gnutella, where your computer would (by default at least) help to construct an index that would help people find files of all types, including some bad files. You might think that’s fine and choose to participate in it, but then again you might be unhappy if the proportion of bad files that you were helping to index was too high for your taste, or their content too vile. Because BitTorrent didn’t have a built-in index, you could use it without running into this issue.

But then, about ten months ago, a new “trackerless” version of BitTorrent came along. This version had a big distributed index, provided cooperatively by the computers of everybody who was using BitTorrent. After this change, if you were using BitTorrent, you were helping to index files. (Strictly speaking, you would be providing “tracker information” for the files; I’m using “index” as shorthand.) Some of those files might be bad.

To be precise, you would be helping to index a small, and randomly chosen, subset of all the BitTorrent files in the world. And if it came to your attention that one of those files was bad, you could choose not to participate in indexing it, by simply refusing to respond to index queries about that file. Standard BitTorrent software doesn’t support this refusal tactic, but the tactic is possible given how the BitTorrent protocol is designed.

Your refusal to provide index information for a file would not, by itself, make the file unavailable. BitTorrent stores index information redundantly, so other people could answer the index queries that you refused to answer. Only if all (or too many) of the people assigned to index a file refused to do so would that file disappear.

If lots of people started refusing to index files they thought were bad, this would amount to a kind of jury system, in which each file was assigned to a random set of BitTorrent “citizens” who voted (by indexing, or refusing to do so) on whether the file should be available. If too many jurors voted to suppress a file, it would disappear.

By now, some of you are jumping up and down, shaking your fingers at me. This is an affront to free speech, you’re saying – every file should be available to everybody. To which I reply: don’t blame me. This is the way BitTorrent is designed. By switching to the trackerless protocol, BitTorrent’s designers created this possibility. And the problem – if you consider it one – can be fixed. How to fix it is a topic for another day.