November 28, 2024

A Grand Unified Theory of Filesharing

Recently we’ve seen several studies of the impact of filesharing on CD sales. We have enough data now to draw some (very) preliminary conclusions, assuming the studies are correct. Despite the apparent contradictions between the various studies, I think there is a plausible theory that can explain them all – a Grand Unified Theory of Filesharing.

First, let’s review the three main results that have to be explained.

  • Survey-based studies, which ask people whether they use the Internet, whether (and how much) they use filesharing, and how many CDs they buy, find that people who fileshare buy fewer CDs.
  • The recent econometric study by Oberholzer and Strumpf, based on per-album time-series data on filesharing activity, CD sales, and other factors, found that filesharing has little or no effect on CD sales.
  • Eric Boorstin’s study found, controlling for differences in personal income, that there is a strong positive correlation between Internet usage and CD purchasing. This held true for all age groups, except the 15-24 group, for whom Internet usage correlates negatively with CD purchasing.

(It’s undisputed that CD sales have dropped sharply in recent years, but there are several plausible causes for that drop. That’s a topic for another day. Here, I’ll assume only that filesharing is not the only cause of the sales drop, so that we don’t need filesharing to explain the drop.)

The Grand Unified Theory explains the study results by breaking down the users of filesharing into two subpopulations, which I will call Free-riders and Samplers.

Free-riders are generally young. They have few if any moral qualms about filesharing, and they tend to assume that others feel the same way. They use filesharing to accumulate libraries of music, as an alternative to buying CDs.

Samplers are generally older and more risk-averse. They are highly engaged with cultural products of all sorts. They are morally conflicted about filesharing, and use it mostly to download songs that either aren’t for sale, or that they don’t value enough to pay for. They buy music that they really like, and filesharing causes them to find more music they like, so it tends to increase their CD purchases.

Now let’s look at how the theory explains the studies’ results.

In survey-based studies, Free-riders admit to filesharing and to buying fewer CDs because of their filesharing. But Samplers are reluctant to confess their filesharing to a stranger, being more risk-averse and more attuned to the dubious moral status of filesharing (not to mention its illegality). The result is that Free-riders are overcounted in survey-based studies, and Samplers are undercounted, so survey-based studies find that filesharing depresses CD sales.

The Oberholzer and Strumpf study measured the actual impact of both Free-riders and Samplers, and found that the lost sales caused by Free-riders are balanced by the increased sales due to Samplers.
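To make the arithmetic behind the last two paragraphs concrete, here is a toy sketch in Python. Every number in it (headcounts, per-person changes in CD purchases, and the rates at which each group admits to filesharing) is invented for illustration and does not come from any of the studies. It only shows how a true net effect near zero can look negative once one group under-reports.

```python
# Toy illustration of reporting bias in a filesharing survey.
# All numbers are hypothetical; none are taken from the studies discussed.

def mean_change(groups, use_admit_rate):
    """Average change in CDs bought per filesharer.

    groups: list of (headcount, change_in_cds_per_year, admit_rate).
    If use_admit_rate is True, each group is weighted by the number of
    members who would admit to filesharing when surveyed; otherwise
    every filesharer is counted.
    """
    total_weight = 0.0
    total_change = 0.0
    for headcount, change, admit_rate in groups:
        weight = headcount * (admit_rate if use_admit_rate else 1.0)
        total_weight += weight
        total_change += weight * change
    return total_change / total_weight

groups = [
    (500, -2.0, 0.9),  # Free-riders: buy fewer CDs, mostly admit filesharing
    (500, +2.0, 0.3),  # Samplers: buy more CDs, mostly keep quiet
]

true_effect = mean_change(groups, use_admit_rate=False)     # everyone counted
surveyed_effect = mean_change(groups, use_admit_rate=True)  # only admitters

print("true effect per filesharer:    ", true_effect)
print("surveyed effect per filesharer:", surveyed_effect)
```

With these made-up inputs the two groups cancel when everyone is counted, while the survey-weighted average comes out negative because Free-riders dominate the respondents who admit to filesharing.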

The Boorstin study had different results for different age groups. His 15-24 age group was mostly Free-riders, who buy fewer CDs when they have Internet access, because their filesharing substitutes for purchases. His older age groups were mostly Samplers, who buy more CDs because of filesharing, and who are also, because of their high level of cultural engagement, predisposed to both Internet usage and CD purchasing. Therefore he found that young Internet users buy fewer CDs, while older Internet users buy a lot more.

So there you have it: a theory that explains the study results, and that seems plausible (to me, at least). Of course, there are lots of caveats here. One or more of the studies might be wrong; or the studies might be right but the theory wrong. But bear with me for a bit longer as I explore the possible consequences of the theory.

The theory says that the net effect of filesharing on CD sales is roughly zero, because of a balance between the negative impact of the Free-riders and the positive impact of the Samplers. But what happens in the future? It all depends on what happens to today’s Free-riders.

Perhaps today’s Free-riders will mature into Samplers, to be replaced by a new generation of Free-riders, so that the effects of the two groups continue in a rough balance. Or perhaps today’s Free-riders, never having known anything else, will keep Free-riding as they get older, and the balance will tip toward Free-riders.

It’s also worth noting that the theory does not predict whether (illegal, free) filesharing will reduce online sales of music. Probably the answer depends on what the online alternatives look like, and how convenient they are to use.

So the theory can explain the present situation, but it doesn’t make strong predictions about the future; or, if you prefer, the theory comes in several flavors, which differ in their future predictions. If we had a better handle on what makes one person a Free-rider and another a Sampler, we could make better predictions.

[Thanks to Eric Boorstin and Andrew Appel for helping me develop and refine these ideas.]

New Study of the Net

Eric Boorstin, a senior at Princeton, just filed his senior thesis, Music Sales in the Age of File Sharing. The thesis includes a clever study of the impact of Internet usage on CD sales. This is a twist on previous studies, which have tried to correlate CD sales to usage of filesharing. The tradeoff here is that although Internet usage is one step removed from filesharing, the data on Internet usage are much more detailed and much more reliable than the data on filesharing usage.

Eric worked from two datasets. The first dataset came from SoundScan, and gave him aggregate sales of CDs, on a week-by-week basis, for many separate metropolitan areas in the U.S. The second dataset came from the U.S. Census Bureau, and contained data on population, income, and Internet usage, broken down by age group and geographic area. The census data came from 1998, 2000, and 2001. Combining these datasets, he ended up with data for CD sales, age group demographics, income, and Internet adoption, at three different points in time, in ninety-nine separate metropolitan areas in the U.S.

Eric took these datasets and did a regression to determine the correlation between Internet adoption rate and CD sales, broken down by age group. He controlled for differences in personal income. (For more methodological details, see the thesis.)

For people in the 15-24 age group, he found a significant negative correlation between Internet adoption and CD sales. For people in all of the age groups older than 25, he found the opposite

Princeton Proposes Quotas to Control Grade Inflation

Princeton is considering putting a cap on the number of A’s that professors could award to students, as a way of fighting grade inflation. Details are in Alyson Zureick’s story in today’s Daily Princetonian. To my knowledge, Princeton would be the first major university to take such a step. The proposal would have to be approved by a vote of the faculty before taking effect.

Grade inflation is a real problem, and it’s a hard one to fight. There are weak but steady pressures that push grades up over time. A professor, faced with a student on the borderline between two grades, finds it easier to give the higher grade; and at the end of a long semester of hard work by professor and student, it feels right to give that borderline student a tiny nudge upward. Students complain about low grades, and sometimes they can point to a grading error that justifies an upward adjustment; but rarely if ever do they complain about generous grades. These nudges and corrections slowly push the average grade upward.

I also think, notwithstanding the occasional grumbling of old-timers, that our students have gotten more capable over the years. If this is true, then grades naturally inflate, unless we grade the same work more harshly than we did in the past.

In recent years, Princeton’s strategy has been to report comparative statistics, telling each department how its grade distribution compares to those of other departments, and telling each professor how his grade distribution compares to those of his peers. Apparently that strategy has not been enough to stop grade inflation.

The new Princeton proposal would require each department to give no more than 35% A’s (including A+ and A-) in courses. It would be left to each department to decide how to stay under this cap.

I don’t know yet whether I’ll vote for or against this proposal. But I do know that if it passes, my department will have to set some policy for allocating our quota of A’s among our courses. Setting that policy will be no fun at all, even in a department as sane and collegial as mine.
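As a rough idea of what such a policy could look like, here is a sketch of one possible rule: split the department's A budget across courses in proportion to enrollment, then hand any leftover A's to the courses with the largest fractional remainders. The 35% figure is from the proposal. The proportional rule, the course names, and the enrollments are purely hypothetical, and a real department might well choose something else.

```python
# Hypothetical allocation of a department-wide A quota among courses.
# Only the 35% cap comes from the proposal; the policy itself is an assumption.

A_CAP_PERCENT = 35  # at most 35% A-range grades (A+, A, A-) per department

def allocate_a_quota(enrollments, cap_percent=A_CAP_PERCENT):
    """Return the maximum number of A-range grades for each course.

    enrollments: dict mapping course name -> number of graded students.
    Integer arithmetic is used so the department total never exceeds the cap.
    """
    total = sum(enrollments.values())
    budget = cap_percent * total // 100  # department-wide A budget

    # Start with each course's share rounded down.
    quota = {c: cap_percent * n // 100 for c, n in enrollments.items()}

    # Give leftover A's to the courses with the largest fractional remainder.
    leftover = budget - sum(quota.values())
    by_remainder = sorted(
        enrollments,
        key=lambda c: (cap_percent * enrollments[c]) % 100,
        reverse=True,
    )
    for course in by_remainder[:leftover]:
        quota[course] += 1
    return quota

# Made-up courses and enrollments.
courses = {"intro": 150, "systems": 48, "seminar": 12}
quota = allocate_a_quota(courses)
print(quota)
```

Even this simple rule shows where the arguments would start: a twelve-student seminar gets only four A's here, however strong its students are.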

UPDATE (10:45 AM): For more reaction, see today’s Boston Globe story by Marcella Bombardieri.

WIPO Considering a Ban on Computers

Ernest Miller points to a draft treaty being considered by the World Intellectual Property Organization. It’s a truly remarkable document. And I don’t mean that in a good way.

Here’s the most amazing part, from Article 16, Alternative V:

2. In particular, effective legal remedies shall be provided against those who:

(iii) participate in the manufacture, importation, sale, or any other act that makes available a device or system capable of decrypting or helping to decrypt an encrypted program-carrying signal.

Every computer is “capable of decrypting or helping to decrypt” such a signal, so this provision, if adopted, would apparently require signatories to the treaty to ban the importation, sale, or distribution of computers.

Note that this is just an “alternative” under consideration. It was proposed by Argentina, and Switzerland proposed language that “roughly corresponds” to it. I don’t know whether the U.S. has taken a position on this, but I assume the U.S. is still in favor of computers being legal.

Trademarks and Ad Placement

Dana Blankenhorn at Moore’s Lore has some interesting discussion of the lawsuit between American Blinds and Google.

Here’s the background: When you do a Google search, the results page gives the search results on the left side of the page, and a few ads (marked as such) on the right edge of the page. The ads are chosen based on the words in your search; advertisers buy placement for particular search words. For example, a pizza restaurant in Princeton might buy placement on searches for “princeton pizza”.

American Blinds makes window blinds. If you search for “American Blinds” on Google, you will see (or at least, you would have seen before the lawsuit) ads for some of American Blinds’ competitors. American Blinds claims that this is a trademark violation, since Google is associating competing products with the trademarked name “American Blinds”. American Blinds says that Google may not sell competing ads keyed to the trademarked name, without the trademark owner’s permission.

Most people’s initial reaction is that American Blinds’ lawsuit is ridiculous and should stand no chance of succeeding. But Google already lost a similar lawsuit in France, and it already lets Dell do what American Blinds wants to do.

To give every trademark holder veto power over the placement of clearly marked ads on search pages seems like bad policy, whatever the law says. (If the ad-laden page were trying to mislead customers about who is connected to the trademark, I would feel differently; but that’s not the case here.) Creating so many vetoes would seriously cramp the ability of Google, or anybody else, to sell keyword-triggered ads, especially given how crowded the namespace has gotten.

Consider a hypothetical hungry traveler who searches for “princeton pizza,” wanting to survey his dinner options. If there’s a restaurant called “Princeton Pizza,” and it has a veto over ad placement on that phrase, the traveler will be frustrated. And it’s hard to see what other search the traveler could do to circumvent the trademark issue and get a list of pizza places in Princeton.

Perhaps Google could provide an “I mean the words I wrote, not the trademark” option, so that a search for “princeton pizza –notrademark” would display ads triggered by the words, ignoring any trademark vetoes. But it’s hard to believe that that would satisfy American Blinds or other trademark owners.
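Here is a small sketch of how keyword-triggered ads, a trademark veto, and such an opt-out flag might interact. It is a toy model, not a description of how Google's system actually works: the function names, the phrase-matching rule, the ASCII spelling `-notrademark`, and the advertisers are all assumptions made up for the example.

```python
# Toy model of keyword-triggered ads with trademark vetoes.
# Hypothetical throughout; it does not describe any real ad system.

OPT_OUT = "-notrademark"  # assumed spelling of the opt-out flag

def contains_phrase(text, phrase):
    """True if phrase appears in text as a run of whole words."""
    return f" {phrase.lower()} " in f" {text} "

def select_ads(query, ads, vetoes):
    """Return the advertisers whose ads would be shown for a query.

    ads: list of (advertiser, keyword_phrase) pairs.
    vetoes: dict mapping trademarked phrase -> trademark owner. When the
        query contains a vetoed phrase, only the owner's ads are shown,
        unless the query includes the opt-out flag.
    """
    words = query.lower().split()
    ignore_vetoes = OPT_OUT in words
    text = " ".join(w for w in words if w != OPT_OUT)

    matched = [(adv, kw) for adv, kw in ads if contains_phrase(text, kw)]
    if not ignore_vetoes:
        for phrase, owner in vetoes.items():
            if contains_phrase(text, phrase):
                matched = [(adv, kw) for adv, kw in matched if adv == owner]
    return [adv for adv, _ in matched]

# Made-up advertisers, all buying placement on the same phrase.
ads = [
    ("Princeton Pizza", "princeton pizza"),
    ("Tiger Slice", "princeton pizza"),
    ("Nassau Pies", "princeton pizza"),
]
vetoes = {"princeton pizza": "Princeton Pizza"}

with_veto = select_ads("princeton pizza", ads, vetoes)
without_veto = select_ads("princeton pizza -notrademark", ads, vetoes)
print(with_veto)
print(without_veto)
```

In this model the plain search shows only the trademark owner's ad, while the flagged search shows every restaurant that bought the phrase.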

Another attempted solution is to say that a trademark owner should get a veto if the only plausible reason for a consumer to search on the name is to find the trademarked item, but that there should be no veto if a consumer might plausibly search for the trademarked phrase for other reasons. That’s a useful distinction in theory, but such a test seems too tricky to apply in practice.

I can’t think of a good way to accommodate the trademark owners’ legitimate interests without essentially shutting down word-based ad placement services. And in the absence of such a solution, it seems to me better to let the ad placements go on.