January 21, 2025

Cost Tradeoffs of P2P

On Thursday, I jumped into a bloggic discussion of the tradeoffs between centrally-controlled and peer-to-peer design strategies in distributed systems. (See posts by Randy Picker (with comments from Tim Wu and others), Lior Strahilevitz, me, and Randy Picker again.)

We’ve agreed, I think, that large-scale online services will be designed as distributed systems, and the basic design choice is between a centrally-controlled design, where most of the work is done by machines owned by a single entity, and a peer-to-peer design, where most of the work is done by end users’ machines. Google is a typical centrally-controlled design. BitTorrent is a typical P2P design.

The question in play at this point is when the P2P design strategy has a legitimate justification. Which justifications are “legitimate”? This is a deep question in general, but for our purposes it’s enough to say that improving technical or economic efficiency is a legitimate justification, but frustrating enforcement of copyright is not. Actions that have legitimate justifications may also have harmful side-effects. For now I’ll leave aside the question of how to account for such side-effects, focusing instead on the more basic question of when there is a legitimate justification at all.

Which design is more efficient? Compared to central control, P2P has both disadvantages and advantages. The main disadvantage is that in a P2P design, the computers participating in the system are owned by people who have differing incentives, so they cannot necessarily be trusted to work toward the common good of the system. For example, users may disconnect their machines when they’re not using the system, or they may “leech” off the system by using the services of others but refusing to provide services. It’s generally harder to design a protocol when you don’t trust the participants to play by the protocol’s rules.

On the other hand, P2P designs have three main efficiency advantages. First, they use cheaper resources. Users pay about the same price per unit of computing and storage as a central provider would pay. But the users’ machines are a sunk cost – they’re already bought and paid for, and they’re mostly sitting idle. The incremental cost of assigning work to one of these machines is nearly zero. In a centrally controlled system, by contrast, new machines must be bought and reserved for use in providing the service.

Second, P2P deals more efficiently with fluctuations in workload. The traffic in an online system varies a lot, and sometimes unpredictably. If you’re building a centrally-controlled system, you have to make sure that extra resources are available to handle surges in traffic; and that costs money. P2P, on the other hand, has the useful property that whenever you have more users, you have more users’ computers (and network connections) to put to work. The system’s capacity grows automatically whenever more capacity is needed, so you don’t have to pay extra for surge-handling capacity.
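A toy model makes the surge argument concrete. This is only a sketch with made-up parameters: `provisioned_units`, `units_per_peer`, and `demand_per_user` are hypothetical, and the model assumes every peer actually contributes its share, which the leeching problem described above would undercut.

```python
def unmet_demand(demand: float, capacity: float) -> float:
    """Work that cannot be served because capacity falls short of demand."""
    return max(0.0, demand - capacity)


def simulate(user_counts, demand_per_user, provisioned_units, units_per_peer):
    """Compare a fixed-capacity central design with a P2P design.

    Returns one (users, central_shortfall, p2p_shortfall) tuple per entry
    in user_counts. All quantities are in arbitrary "units of work".
    """
    rows = []
    for users in user_counts:
        demand = users * demand_per_user
        # Central design: capacity was bought in advance and does not
        # change when traffic surges.
        central_shortfall = unmet_demand(demand, provisioned_units)
        # P2P design: every additional user brings a machine along, so
        # capacity rises and falls with the number of users.
        p2p_shortfall = unmet_demand(demand, users * units_per_peer)
        rows.append((users, central_shortfall, p2p_shortfall))
    return rows


if __name__ == "__main__":
    for row in simulate([100, 1000], demand_per_user=1.0,
                        provisioned_units=500, units_per_peer=1.5):
        print(row)
```

In this sketch the central design falls short as soon as demand exceeds what was provisioned, while the P2P design keeps up as long as each peer contributes at least as much work as it consumes. If `units_per_peer` drops below `demand_per_user`, the P2P shortfall grows with the user count instead.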

Third, P2P allows users to subsidize the cost of running the system, by having their computers do some of the work. In theory, users could subsidize a centrally-controlled system by paying money to the system operator. But in practice, monetary transfers can bring significant transaction costs. It can be cheaper for users to provide the subsidy in the form of computing cycles than in the form of cash. (A full discussion of this transaction cost issue would require more space – maybe I’ll blog about it someday – but it should be clear that P2P can reduce transaction costs at least sometimes.)

Of course, this doesn’t prove that P2P is always better, or that any particular P2P design in use today is motivated only by efficiency considerations. What it does show, I think, is that the relative efficiency of centrally-controlled and P2P designs is a complex and case-specific question, so that P2P designs should not be reflexively labeled as illegitimate.

"Centralized" Sites Not So Centralized After All

There’s a conversation among Randy Picker, Tim Wu, and Lior Strahilevitz over at the U. Chicago Law School Blog about the relative merits of centralized and peer-to-peer designs for file distribution. (Picker post with Wu comments; Strahilevitz post) Picker started the discussion by noting that photo sharing sites like Flickr use a centralized design, rather than peer-to-peer. He questioned whether P2P design made sense, except as a way to dodge copyright enforcement. Wu pointed out that P2P designs can distribute large files more efficiently, as in BitTorrent. Strahilevitz pointed out that P2P designs resist censorship more effectively than centralized ones.

There’s a subtlety hiding here, and in most cases where people compare centralized services to distributed ones: from a technology standpoint, the “centralized” designs aren’t really centralized.

A standard example is Google. It’s presented to users as a single website, but if you look under the hood you’ll see that it’s really implemented by a network of hundreds of thousands of computers, distributed in data centers around the world. If you direct your browser to www.google.com, and I direct my browser to the same URL, we’ll almost certainly interact with entirely different sets of computers. The unitary appearance of the Google site is an illusion maintained by technical trickery.

The same is almost certainly true of Flickr, though on a smaller scale. Any big service will have to use a distributed architecture of some sort.

So what distinguishes “centralized” sites from P2P designs? I see two main differences.

(1) In a “centralized” site, all of the nodes in the distributed system are controlled by the same entity; in a P2P design, most nodes are controlled by end users. There is a technical tradeoff here. Centralized control offers some advantages, but it sacrifices the potential scalability that can come from enlisting the multitude of end user machines. (Users own most of the machines in the world, and those machines are idle most of the time – that’s a big untapped resource.) Depending on the specific application, one strategy or the other might offer better reliability.

(2) In a “centralized” site, the system interacts with the user through browser technologies; in a P2P design, the user downloads a program that offers a more customized user interface. There is another technical tradeoff here. Browsers are standardized and visiting a website is less risky for the user than downloading software, but a custom user interface sometimes serves users better.

The Wu and Strahilevitz argument focused on the first difference, which does seem the more important one these days. The bottom line, I think, is that P2P-style designs that involve end users’ machines make the most sense when scalability is at a premium, or when such designs are more robust.

But it’s important to remember that the issue isn’t whether the service uses lots of distributed computers. The issue is who controls those computers.

Cellphone Denial of Service

A new paper by Enck, Traynor, McDaniel, and La Porta argues that cellphone networks that support SMS, a technology for sending short text messages to phones, are subject to denial of service attacks. The researchers claim that a clever person with a fast home broadband connection could potentially block cell phone calling in Manhattan or Washington, DC.

A mobile phone network divides the world into cells. A phone connects to the radio tower that serves the cell it is currently in. Within each cell, the system uses a set of radio channels to carry voice conversations, and one radio channel for control. The control channel is used to initiate calls; but once initiated, a call switches over to one of the voice channels.

It turns out that the control channels are also used to deliver SMS messages to phones in the cell. If too many SMS messages show up in the same cell all at once, they can monopolize that cell’s control channel, leaving no openings for initiating calls. The result is that a large enough burst of SMS messages effectively blocks call initiation in a cell.

The paper discusses how an attacker could create a large enough flurry of SMS messages, including how he might figure out which phones are likely to be active in the target area. (An SMS message only uses a cell’s control channel if the message is directed to a phone that is currently in the cell.)

Today’s New York Times makes a big deal out of this, but I don’t think it’s as important as the Times implies. For one thing, it’s relatively easy to fix, for example by reserving a certain fraction of each cell’s control channel for call initiation.
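To see why reserving part of the control channel helps, here is a deliberately simplified sketch. It is not a model of any real cellular protocol: the idea of a fixed number of control "slots" per time step, the function name `allocate_control_slots`, and all of the numbers are assumptions made up for illustration.

```python
def allocate_control_slots(slots, sms_requests, call_requests,
                           reserved_for_calls=0):
    """Divide one cell's control-channel slots for a single time step.

    Returns (sms_served, calls_served). Models the worst case for
    callers: the SMS flood is queued ahead of every call setup.
    """
    if not 0 <= reserved_for_calls <= slots:
        raise ValueError("reserved_for_calls must be between 0 and slots")
    # SMS deliveries may use only the unreserved part of the channel.
    sms_served = min(sms_requests, slots - reserved_for_calls)
    # Call setups may use the reserved slots plus anything SMS left over.
    calls_served = min(call_requests, slots - sms_served)
    return sms_served, calls_served


if __name__ == "__main__":
    # No reservation: a flood of 100 messages leaves no room for calls.
    print(allocate_control_slots(10, 100, 3))
    # Two slots reserved: some calls get through despite the flood.
    print(allocate_control_slots(10, 100, 3, reserved_for_calls=2))
```

With no reservation, the flood takes all ten slots and no call is initiated. With two slots reserved, two of the three call setups get through while SMS delivery is limited to the remaining eight slots. The cost of the fix is that ordinary SMS traffic has slightly less of the channel available under heavy load.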

Others have speculated that this problem must already have been fixed, because it seems implausible that such a simple flaw would exist in an advanced network run by a large, highly competent provider. I wouldn’t draw that conclusion, though. It’s in the nature of security that there are a great many mistakes a system designer can make, each of which seems obvious once you think of it. A big part of securing a complicated system is simply thinking up all of the straightforward mistakes you might have made, and verifying that you haven’t made them. Big systems built by competent designers have seemingly obvious flaws all the time.

(Putting on my professor’s hat, I’m obliged to point out that systems that are small and easily modeled are best handled by building formal proofs of security; but that’s a nonstarter for anything as complex as a cell network. (In case you’re wondering what my professor’s hat looks like, it’s purple and eight-sided, with a little gold tassel.))

The biggest surprise to me is how few SMS messages it takes to clog the system. The paper estimates that hundreds of SMS messages per second, sent in the right way, are probably enough to block cell calling in a major provider’s network in all of Manhattan, or all of Washington, DC. Given those numbers, I’m surprised that the networks aren’t congested all the time, just based on ordinary traffic. I guess people use SMS less than one might have thought.

eDonkey Seeks Record Industry Deal

Derek Slater points to last week’s Senate hearing testimony by Sam Yagan, President of MetaMachine, the distributor of the popular eDonkey peer-to-peer file sharing software.

The hearing’s topic was “Protecting Copyright and Innovation in a Post-Grokster World”. Had the Supreme Court drawn a clearer legal line in its Grokster decision, we wouldn’t have needed such a hearing. But the Court instead chose to create a vague new inducement standard that will apparently ensnare Grokster, but that leaves us in the dark about the boundaries of copyright liability for distributors of file sharing technologies.

It has long been rumored that the record and movie industries avoided dealmaking with P2P companies during the Grokster case, because deals would undercut the industry’s efforts to paint P2P as an outlaw technology. Yagan asserts that these rumors are true:

[MetaMachine] held multiple meetings with major music labels and publishers as well as movie studios, and at one point, received verbal commitments from major entertainment firms to proceed with proof-of-concept technical testing and market trials.

The firms later rescinded these approvals, however, with the private explanation that to proceed in collaboration with eDonkey on a business solution, or even to appear to be doing so, could jeopardize the case of the petitioners in the pending MGM v. Grokster litigation.

An obvious question now is whether the record industry will sue MetaMachine on a Grokster-based inducement theory. The industry did send a cease-and-desist letter to MetaMachine, along with several other P2P vendors. Yagan asserted that MetaMachine could successfully defend a recording industry lawsuit. I don’t know whether that’s right – I don’t have access to the facts upon which a court would decide whether MetaMachine has induced infringement – but it’s at least plausible.

Whether MetaMachine could actually win such a suit is irrelevant, though, because the company can’t afford to fight a suit, and can’t afford to risk the very high statutory damages it would face if it lost. So, Yagan said, MetaMachine has no choice but to make a deal now, on the record industry’s terms.

Because we cannot afford to fight a lawsuit – even one we think we would win – we have instead prepared to convert eDonkey’s user base to an online content retailer operating in a “closed” P2P environment. I expect such a transaction to take place as soon as we can reach a settlement with the RIAA. We hope that the RIAA and other rights holders will be happy with our decision to comply with their request and will appreciate our cooperation to convert eDonkey users to a sanctioned P2P environment.

MetaMachine has decided, in other words, that it is infeasible to sell P2P tools without the record industry’s blessing. The Supreme Court said pretty clearly in its Grokster decision that record industry approval is not a necessary precondition for a P2P technology to be legal. But record industry approval may be a practical necessity nonetheless. Certainly, the industry is energetically spreading the notion that when it comes to P2P systems, “legitimate” is a synonym for “approved by the record industry”.

But just when we’re starting to feel sympathy for Yagan and MetaMachine as victims of copyright overreaching, he does some overreaching of his own. eDonkey faces competition from a compatible, free program called eMule; and Yagan wants eMule shut down.

Not only have the eMule distributors adopted a confusingly similar name, but they also designed their application to communicate with our eDonkey clients using our protocol.

In other words, eMule clients basically camouflage themselves as eDonkey clients in order to download files from eDonkey users. As a result, eMule computers actually usurp some of the bandwidth that should be allocated to eDonkey file transfers, degrading the experience of eDonkey users.

Ignoring the loaded language, what’s happening here is that the eMule program is compatible with eDonkey, so that eMule users and eDonkey users can share files with each other. This isn’t illegal, and Yagan offers no argument that it is. Indeed, his testimony is artfully worded to give the impression, without actually saying so, that creating compatible software without permission is clearly illegal. I guess he figures that if we’re going to have copyright maximalism, we might as well have it for everybody.

There’s more interesting stuff in Yagan’s testimony, but I’m out of space here. Mark Lemley’s testimony is interesting too, offering some thoughtful suggestions.

Net Governance Debate Heats Up

European countries surprised the U.S. Wednesday by suggesting that an international body rather than the U.S. government should have ultimate control over certain Internet functions. According to Tom Wright’s story in the International Herald Tribune,

The United States lost its only ally [at the U.N.’s World Summit on the Information Society] late Wednesday when the EU made a surprise proposal to create an intergovernmental body that would set principles for running the Internet. Currently the U.S. Commerce Department approves changes to the Internet’s “root zone files”, which are administered by the Internet Corporation for Assigned Names and Numbers, or Icann, a nonprofit organization based in Marina del Rey, California.

As often happens, this discussion seems to confuse control over Internet naming with control over the Internet as a whole. Note the juxtaposition: the EU wants a new body to “set principles for running the Internet”; currently the U.S. controls naming via ICANN.

This battle would be simpler and less intense if it were only about naming. What is really at issue is who will have the perceived legitimacy to regulate the Internet. The U.S. fears a U.N.-based regulator, as do I. Much of the international community fears and resents U.S. hegemony over the Net. (General anti-Americanism plays a role too, as in the Inquirer’s op-ed.)

The U.S. would have cleaner hands in this debate if it swore off broad regulation of the Net. It’s hard for the U.S. to argue against creating a new Internet regulator when the U.S. itself looks eager to regulate the Net. Suspicion is strong that the U.S. will regulate the Net to the advantage of its entertainment and e-commerce industries. Here’s the Register’s story:

The UN’s special adviser for internet governance, Nitin Desai, told us that the issue of control was particularly stark for developing nations, where the internet is not so much an entertainment or e-commerce medium but a vital part of the country’s infrastructure.

[Brazilian] Ambassador Porto clarified that point further: “Nowadays our voting system in Brazil is based on ICTs [information and communication technologies], our tax collection system is based on ICTs, our public health system is based on ICTs. For us, the internet is much more than entertainment, it is vital for our constituencies, for our parliament in Brazil, for our society in Brazil.” With such a vital resource, he asked, “how can one country control the Internet?”

The U.S. says flatly that it will not agree to an international governance scheme at this time.

If the U.S. doesn’t budge, and the international group tries to go ahead on its own, we might see a split, where a new entity I’ll call “UNCANN” coexists with ICANN, with each of the two claiming authority over Internet naming. This won’t break the Internet, since each user will choose to pay attention to either UNCANN or ICANN. To the extent that UNCANN and ICANN assign names differently, there will be some confusion when UNCANN users talk to ICANN users. I wouldn’t expect many differences, though, so probably the creation of UNCANN wouldn’t make much difference, except in two respects. First, the choice to point one’s naming software at UNCANN or ICANN would probably take on symbolic importance, even if it made little practical difference. Second, UNCANN’s aura of legitimacy as a naming authority would make it easier for UNCANN to issue regulatory decrees that were taken seriously by the states that would ultimately have to implement them.

This last issue, of regulatory legitimacy, is the really important one. All the talk about naming is a smokescreen.

My guess is that the Geneva meeting will break up with much grumbling but no resolution of this issue. The EU and the rest of the international group won’t move ahead with their own naming authority, and the U.S. will tread more carefully in the future. That’s the best outcome we can hope for in the short term.

In the longer term, this issue will have to be resolved somehow. Until it is, many people around the world will keep asking the question, “Who runs the Internet?”, and not liking the answer.