
Archives for 2008

An Illustration of Wikipedia's Vast Human Resources

The Ashley Todd incident has given us a nice illustration of the points I made on Friday about “free-riding” and Wikipedia. As Clay Shirky notes, there’s a quasi-ideological divide within Wikipedia between “deletionists” who want to tightly control the types of topics that are covered on Wikipedia and “inclusionists” who favor a more liberal policy. On Friday, the Wikipedia page on Ashley Todd became the latest front in the battle between them. You can see the argument play out here. For the record, both Shirky and I came down on the inclusionists’ side. The outcome of the debate was that the article was renamed from “Ashley Todd” to “Ashley Todd mugging hoax,” an outcome I was reasonably happy with.

Notice how the Wikipedia process reverses the normal editorial process. If Britannica were considering an article on Ashley Todd, some Britannica editor would first perform a cost-benefit analysis to decide whether the article would be interesting enough to readers to justify the cost of creating it. If she thought it was, then she would commission someone to write it and pay the writer for his work. Once the article was written, she would almost always include it in the encyclopedia, because she had paid good money for it.

In contrast, the Wikipedia process is that some people go ahead and create an article and then there is frequently an argument about whether the article should be kept. The cost of creating the article is so trivial, relative to Wikipedia’s ample resources of human time and attention, that it’s not even mentioned in the debate over whether to keep the article.

To get a sense of the magnitude of this imbalance, consider that in less than 24 hours, dozens of Wikipedians generated roughly 5,000 words of arguments for and against deleting an article that itself runs only about 319 words. The effort people (including me) spent arguing about whether to have the article dwarfed the effort required to create it in the first place.

Not only does Wikipedia have no difficulties overcoming a “free rider” problem, but the site actually has so many contributors that it can afford to squander vast amounts of human time and attention debating whether to toss out work that has already been done but may not meet the community’s standards.

The Trouble with "Free Riding"

This week, one of my favorite podcasts, EconTalk, features one of my favorite Internet visionaries, Clay Shirky. I interviewed Shirky when his book came out back in April. The host, Russ Roberts, covered some of the same ground, but also explored some different topics, so it was an enjoyable listen.

I was struck by something Prof. Roberts said about 50 minutes into the podcast:

One of the things that fascinates me about [Wikipedia] is that I think if you’d asked an economist in 1950, 1960, 1970, 1980, 1990, even 2000: “could Wikipedia work,” most of them would say no. They’d say “well it can’t work, you see, because you get so little glory from this. There’s no profit. Everyone’s gonna free ride. They’d love to read Wikipedia if it existed, but no one’s going to create it because there’s a free-riding problem.” And those folks were wrong. They misunderstood the pure pleasure that overcomes some of that free-rider problem.

He’s right, but I would make a stronger point: the very notion of a “free-rider problem” is nonsensical when we’re talking about a project like Wikipedia. When Roberts says that Wikipedia solves “some of” the free-rider problem, he seems to be conceding that there’s some kind of “free rider problem” that needs to be overcome. I think even that is conceding too much. In fact, talking about “free riding” as a problem the Wikipedia community needs to solve doesn’t make any sense. The overwhelming majority of Wikipedia users “free ride,” and far from being a drag on Wikipedia’s growth, this large audience acts as a powerful motivator for continued contribution to the site. People like to contribute to an encyclopedia with a large readership; indeed, the enormous number of “free-riders”—a.k.a. users—is one of the most appealing things about being a Wikipedia editor.

This is more than a semantic point. Unfortunately, the “free riding” frame is one of the most common ways people discuss the economics of online content creation, and I think it has been an obstacle to clear thinking.

The idea of “free riding” is based on a couple of key 20th-century assumptions that just don’t apply to the online world. The first assumption is that the production of content is a net cost that must either be borne by the producer or compensated by consumers. This is obviously true for some categories of content—no one has yet figured out how to peer-produce Hollywood-quality motion pictures, for example—but it’s far from universal. Moreover, the real world abounds in counterexamples. No one loses sleep over the fact that spectators “free ride” on company softball games, community orchestras, or amateur poetry readings. To the contrary, it’s understood that the vast majority of musicians, poets, and athletes find these activities intrinsically enjoyable, and they’re grateful to have an audience “free riding” on their efforts.

The same principle applies to Wikipedia. Participating in Wikipedia is a net positive experience for both readers and editors. We don’t need to “solve” the free rider problem because there are more than enough people out there for whom the act of contributing is its own reward.

The second problem with the “free riding” frame is that it fails to appreciate that the sheer scale of the Internet changes the nature of collective action problems. With a traditional meatspace institution like a church, business or intramural sports league, it’s essential that most participants “give back” in order for the collective effort to succeed. The concept of “free riding” emphasizes the fact that traditional offline institutions expect and require reciprocation from the majority of their members for their continued existence. A church in which only, say, one percent of members contributed financially wouldn’t last long. Neither would an airline in which only one percent of the customers paid for their tickets.

On Wikipedia—and a lot of other online content-creation efforts—the ratio of contributors to users just doesn’t matter. Because the marginal cost of copying and distributing content is very close to zero, institutions can get along just fine with arbitrarily high “free riding” rates. All that matters is whether the absolute number of contributors is adequate. And because some fraction of new users will always become contributors, an influx of additional “free riders” is almost always a good thing.

Talking about peer production as solving a “free-rider problem,” then, gets things completely backwards. The biggest danger collaborative online projects face is not “free riding” but obscurity. A tiny free software project in which every user contributes code is in a much worse position than a massively popular project like Firefox in which 99.9 percent of users “free ride.” Obviously, every project would like more of its users to become contributors. But the far more important objective for an online collaborative effort is to grow the total size of the user community. New “free riders” are better than nothing.
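To make the arithmetic concrete, here is a toy sketch in Python. Every number in it is invented for illustration (nothing here comes from actual Wikipedia or Mozilla data), and the only assumption doing real work is that the marginal cost of serving one more reader is roughly zero.

```python
# A toy model of "free riding" in an online project. All numbers are
# hypothetical; the point is only that when the marginal cost of serving a
# reader is ~0, the contributor-to-user RATIO is irrelevant -- the ABSOLUTE
# number of contributors determines how much work gets done.

def project_snapshot(total_users: int, contributor_fraction: float,
                     marginal_cost_per_user: float = 0.0) -> dict:
    contributors = int(total_users * contributor_fraction)
    return {
        "users": total_users,
        "contributors": contributors,
        "cost_of_serving_free_riders":
            (total_users - contributors) * marginal_cost_per_user,
    }

# A tiny project in which every single user contributes...
print(project_snapshot(total_users=200, contributor_fraction=1.0))
# ...versus a popular project in which 99.9 percent of users "free ride."
print(project_snapshot(total_users=5_000_000, contributor_fraction=0.001))
```

In this made-up comparison, the second project ends up with twenty-five times as many contributors despite its sky-high “free riding” rate, and the free riders cost it essentially nothing to serve.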

I think this misplaced focus on free-riding relates to the Robert Laughlin talk I discussed on Wednesday. I suspect that one of the reasons Laughlin is dismissive of business models that involve giving away software is that he’s used to traditional business models in which the marginal customer always imposes non-trivial costs. Companies that sell products made out of atoms would obviously go bankrupt if they tried to give away an unlimited number of their products. We’ve never before had goods that could be replicated infinitely and distributed at close to zero cost, and so it’s not surprising that our intuitions and our economic models have trouble dealing with them. But they’re not going away, so we’re going to have to adjust our models accordingly. Dispensing with the concept of “free riding” is a good place to start.

In closing, let me recommend Mark Lemley’s excellent paper on the economics of free riding as it applies to patent and copyright debates. He argues persuasively that eliminating “free riding” is not only undesirable, but that it’s ultimately not even a coherent objective.

Abandoning the Envelope Analogy (What Your Mailman Knows Part 2)

Last time, I commented on NPR’s story about a mail carrier named Andrea in Seattle who can tell us something about the economic downturn by revealing private facts about the people she serves on her mail route. By critiquing the decision to run the story, I drew a few lessons about the way people value and weigh privacy. In Part 2 of this series, I want to tie this to NebuAd and Phorm.

It’s probably a sign of the deep level of monomania to which I’ve descended that as I listened to the story, I immediately started drawing connections between Andrea and NebuAd/Phorm. Technology policy almost always boils down to a battle over analogies, and many in the ISP surveillance/deep packet inspection debate embrace the so-called envelope analogy. (See, e.g., the comments of David Reed to Congress about DPI, and the FCC’s Comcast/BitTorrent order.) Just as mail carriers are prohibited from opening closed envelopes, a typical argument goes, so too should packet carriers be prohibited from looking “inside” the packets they deliver.

As I explain in my article, I’m not a fan of the envelope analogy. The NPR story gives me one more reason to dislike it: envelopes–the physical kind–don’t mark as clear a line of privacy as we may have thought. Although Andrea is restricted by law from peeking inside envelopes, every day her mail route is awash in “metadata” that reveal much more than the mere words scribbled on the envelopes themselves. By analyzing all of this metadata, Andrea has many ways of inferring what is inside the envelopes she delivers, and she feels pretty confident about her guesses.

There are metadata gleaned from the envelopes themselves: certified letters usually mean bad economic news; utility bills turn from white to yellow to red as a person slides toward insolvency. She also engages in traffic analysis: fewer credit card offers might herald the credit crunch. She picks up cues from the surroundings, too: more names on a mailbox might mean that a young man who can no longer make rent has moved in with grandma. Perhaps most importantly, she interacts with the human recipients of these envelopes: she tells the story of a cafe owner who jokes that he needs the credit card offers to pay his bills, and she describes people who watch her approach with “a real desperation in their eyes; when they see me their face falls; what am I going to bring today?”

So let’s stop using the envelope analogy: the comparison doesn’t fit nearly as well as it first appears. But I have a deeper objection to its use in the DPI/ISP surveillance debate: it states a problem rather than proposing a solution, and it assumes away all of the hard questions. Saying that there is an “inside” and an “outside” to a packet is just another way of saying that we need to draw a line between permissible and impermissible scrutiny, but it offers no guidance about how or where to draw that line. The promise of the envelope analogy is that it is clear and easy to apply, yet the rules proposed to implement it are rarely so clear.
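To see why the line is hard to draw, it helps to look at what a packet actually contains. The sketch below hand-builds a minimal IPv4/TCP packet in Python (the addresses, ports, and payload are all made up for illustration) and prints what an observer sees at each layer. Even before any “deep” inspection of the payload, the port numbers in the TCP header hint at which application is in use, which is exactly the kind of ambiguity the envelope analogy papers over.

```python
import struct

# A hand-built IPv4 + TCP packet (all values hypothetical, checksums left zero).
payload = b"GET /private-page HTTP/1.1\r\n"

ip_header = struct.pack(
    "!BBHHHBBH4s4s",
    (4 << 4) | 5,              # version 4, header length = 5 words (20 bytes)
    0,                         # DSCP / ECN
    40 + len(payload),         # total length: 20 (IP) + 20 (TCP) + payload
    0x1234,                    # identification
    0,                         # flags + fragment offset
    64,                        # TTL
    6,                         # protocol = TCP
    0,                         # header checksum (omitted in this sketch)
    bytes([192, 0, 2, 1]),     # source address (documentation range)
    bytes([198, 51, 100, 7]),  # destination address (documentation range)
)

tcp_header = struct.pack(
    "!HHIIBBHHH",
    49152,                     # source port
    80,                        # destination port: already hints at web traffic
    0, 0,                      # sequence / acknowledgment numbers
    5 << 4,                    # data offset = 5 words (20 bytes), no options
    0x18,                      # flags (PSH + ACK)
    65535,                     # window size
    0,                         # checksum (omitted in this sketch)
    0,                         # urgent pointer
)

packet = ip_header + tcp_header + payload

def describe(pkt: bytes) -> None:
    """Print what a carrier can see at successively 'deeper' layers."""
    ihl = (pkt[0] & 0x0F) * 4                      # IP header length in bytes
    src = ".".join(str(b) for b in pkt[12:16])
    dst = ".".join(str(b) for b in pkt[16:20])
    print(f"IP layer (clearly 'outside'):    {src} -> {dst}, protocol {pkt[9]}")

    tcp = pkt[ihl:]
    sport, dport = struct.unpack("!HH", tcp[:4])
    offset = (tcp[12] >> 4) * 4                    # TCP header length in bytes
    print(f"TCP layer (outside or inside?):  port {sport} -> {dport}")

    print(f"Application data (the 'inside'): {tcp[offset:]!r}")

describe(packet)
```

Port 80 lives in a header, not the payload, yet it already tells the carrier that this is probably web traffic; reading the payload reveals the specific page requested. Where exactly the “envelope” ends is precisely the question in dispute.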

Vote flipping on the Hart InterCivic eSlate

There have been numerous press reports about “vote flipping.” I did an analysis of the eSlate, my local voting machine, including mocked-up screenshots, to try to explain the issue.

Robert Laughlin's Unwarranted Pessimism

Monday’s edition of the Cato Institute’s daily podcast features an interview with Robert Laughlin, a Nobel Laureate in physics who wrote a book called The Crime of Reason about the ways that national security, patent, and copyright laws are restricting scientific research and the pursuit of knowledge. While I was in DC a couple of weeks ago, I had the opportunity to attend a talk he gave on the subject.

During his career, Laughlin experienced firsthand the strict regulatory regime that was erected to limit the spread of knowledge about the atom bomb. Under the Atomic Energy Act, first enacted in 1946, knowledge about certain aspects of nuclear physics is “born classified,” and anyone who independently discovers such knowledge is prohibited from disseminating it. Laughlin’s provocative thesis is that these laws served as a template for modern regulatory regimes that limit the spread of knowledge in other areas. These regimes include regulation of chemical and biological research on national security grounds, and also (to some extent) patent and copyright rules that restrict the dissemination or use of certain ideas for the benefit of private parties.

I’m sympathetic to his thesis that we should be concerned about the dangers of laws that restrict research and free inquiry. And as an eminent scientist who has first-hand experience with such regulations, Laughlin is well-positioned to throw a spotlight on the problem. Unfortunately, his argument gets a little muddled when he turns to the subject of economics. Laughlin takes a dim view of commerce, characterizing it as a deceptive form of gambling. In his view, markets are a kind of amoral war of all against all, in which we get ahead by deceiving others and controlling the flow of information. Laughlin thinks that despite the baleful effects of strong copyright and patent laws on scientific research and free inquiry, we simply can’t do without them, because it’s impossible for anyone to turn a profit without such restrictions:

In the information age, knowledge must be dear. The information age must be a time of sequestered knowledge. Why is that? Simple: you can’t get anybody to pay for something that’s free. If it’s readily available, no one will pay you for it. Therefore, the essence of the knowledge economy is hiding knowledge and making sure that people pay for it.

This claim surprised me because there are lots of companies that have managed to turn a profit while building open technological platforms. So during the question-and-answer session, I pointed out that there seemed to be plenty of technologies—the Internet being the most prominent example—that have succeeded precisely because knowledge about them was not “sequestered.”

Laughlin responded:

If you want someone to buy software from you, you have to make it secret because otherwise they’ll go on the Internet and get it for free. You say that there are successful models of software that don’t work on that model. You bet there are. But to my knowledge, all of them are functionally advertising.

It may be true that most firms that build open technologies rely on advertising-based business models, but it’s not clear what that has to do with his broader thesis that profits require secrecy. Google may be “only” an advertising company, but it’s an immensely profitable one. IBM’s contributions to free software projects may be a clever marketing gimmick to sell its hardware and support services, but it seems to be an immensely successful marketing gimmick. And indeed, the 20th century saw Hollywood earn billions of dollars selling advertising alongside free television content. I don’t think Rupert Murdoch stays up at night worrying that most of his profits “only” come from advertising.

More generally, as Mike Masnick has written at some length, Laughlin gets things completely backwards when he claims that an information economy is one in which information is expensive. To the contrary, the cost of information is falling rapidly and will continue to do so. What is getting more valuable are goods and services that are complementary to information—hardware devices, tools to search and organize the information, services to help people manage and make sense of the information, and so forth. A world of abundant and ubiquitous information isn’t a world in which we all starve. It’s a world in which more of us make our living producing goods and services that help people get more value out of the information goods they have available to them. Both Google’s search engine and IBM’s consulting services fit this model.

Finally, it’s important to keep in mind that we don’t face a binary choice between today’s excessive copyright and patent laws and no protections at all. Software has enjoyed traditional copyright protections since the 1970s, and to my knowledge those laws have not caused the kinds of problems that Laughlin talks about in his book. It is more recent changes in the law—notably the DMCA and the expansion of software patents—that have proven highly controversial. Rolling back some of these more recent changes to the law would not leave software companies bereft of legal protections for their products. Software piracy would still be illegal.

In short, I’m pleased to see Laughlin publicizing the risks of overzealous information regulations, but I think his pessimism is unwarranted. We can have a legal system that protects national security, promotes economic growth, and preserves researchers’ freedom of inquiry. Audio and video of the full Cato event are available here.