November 26, 2024

Privacy, Price Discrimination, and Identification

Recently it was reported that Disney World is fingerprinting its customers. This raised obvious privacy concerns. People wondered why Disney would need that information, and what they were going to do with it.

As Eric Rescorla noted, the answer is almost surely price discrimination. Disney sells multi-day tickets at a discount. They don’t want people to buy (say) a ten-day ticket, use it for two days, and then resell the ticket to somebody else. Disney makes about $200 more by selling five separate two-day tickets than by selling a single ten-day ticket. To stop this, they fingerprint the users of such tickets and verify that the fingerprint associated with a ticket doesn’t change from day to day.

Price discrimination often leads to privacy worries, because some price discrimination strategies rely on the ability to identify individual customers so the seller knows what price to charge them. Such privacy worries seem to be intensifying as technology advances, since it is becoming easier to keep records about individual customers, easier to get information about customers from outside sources, and easier to design and manage complex price discrimination strategies.

On the other hand, some forms of price discrimination don’t depend on identifying customers. For example, early-bird discounts at restaurants cause customers to self-select into categories based on willingness to pay (those willing to come at an inconvenient time to get a lower price vs. those not willing) without needing to identify individuals.

Disney’s type of price discrimination falls into a middle ground. They don’t need to know who you are; all they need to know is that you are the same person who used the ticket yesterday. I think it’s possible to build a fingerprint-based system that stores just enough information to verify that a newly-presented fingerprint is the same one seen before, but without storing the fingerprint itself or even information useful in reconstructing or forging it. That would let Disney get what it needs to prevent ticket resale, without compromising customers’ fingerprints.
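To make the idea concrete, here is a minimal sketch of a "same person as yesterday" check that keeps only a salted, one-way digest with the ticket. It rests on a large assumption: that the scanner can produce an identical byte string (called `template` here, a hypothetical name) each time the same finger is read. Real fingerprint readings vary from scan to scan, so a practical system would need an error-tolerant step before hashing. The function names and the iteration count are illustrative, and nothing here describes what Disney actually does.

```python
import hashlib
import hmac
import secrets

# Assumption: the work factor is illustrative, not a recommendation.
ITERATIONS = 100_000


def enroll(template: bytes) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store with the ticket; the template is not kept."""
    salt = secrets.token_bytes(16)  # fresh random salt per ticket
    digest = hashlib.pbkdf2_hmac("sha256", template, salt, ITERATIONS)
    return salt, digest


def verify(template: bytes, salt: bytes, digest: bytes) -> bool:
    """Check whether a newly presented template matches the enrolled one."""
    candidate = hashlib.pbkdf2_hmac("sha256", template, salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, digest)
```

Because each ticket gets its own salt, the stored records for two tickets cannot be matched against each other by simple comparison, which is the property the ticket-resale check needs and no more.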

If this is possible, why isn’t Disney doing it? I can only guess, but I can think of two reasons. First, in designing identity-based systems, people seem to gravitate to designs that try to extract a “true identity”, despite the fact that this is more privacy-compromising and is often unnecessary. Second, if Disney sees customer privacy mainly as a public-relations issue, then they don’t have much incentive to design a more privacy-protective system, when ordinary customers can’t easily tell the difference.

Researchers have been saying for years that identification technologies can be designed cleverly to minimize unneeded information flows; but this suggestion hasn’t had much effect. Perhaps bad publicity over information leaks will cause companies to be more careful.

Thee and Ay

It’s not often that you learn something about yourself from a stranger’s blog. But that’s what happened to me on Friday. I was sifting through a list of new links to this blog (thanks to Technorati), and I found an entry on a blog called Serendipity, about the way I pronounce the word “the”. It turns out that my pronunciation of “the” is inconsistent, in an interesting way. In fact, in a single eight-minute public talk, I pronounce “the” in four different ways.

(Could there possibly be a less enticing premise for a blog entry than how the blog’s author pronounces the word “the”? Well, I think the details turn out to be interesting. And it’s my blog.)

Here’s the background. The article “the” in English is pronounced in two different ways, unreduced (“thee”), and reduced (“thuh”). The standard is to use the unreduced form when the next word starts with a vowel sound (“thee elephant”), and the reduced form when the next word starts with a consonant sound (“thuh dog”).

After Mark Liberman discussed this on the Language Log, readers pointed out that George W. Bush sometimes pronounces ‘a’ as the unreduced “ay” before a consonant. Bush did this a few times in his speech nominating John Roberts to the Supreme Court. Roberts also used one “thee” and one “ay” before consonants in the ensuing Q&A session.

Then Chris Waigl remembered, somehow, that she had heard me do something similar in a recorded talk. So she dug up an eight-minute recording of me speaking at the 2002 Berkeley DRM conference, and analyzed each use of “a” and “the”. She even color-coded the transcript.

It turns out that I pronounced “the” before a consonant four different ways. Sometimes I used “thee”, sometimes I used “thuh”, sometimes I used “thee” and corrected myself to “thuh”, and sometimes I used “thuh” and corrected myself to “thee”.

Why do I do this? I have no idea. I have been listening to myself ever since I read this, and I do indeed mix reduced and unreduced “the” and “a” before consonants. I haven’t caught myself correcting one to the other, but then again I probably wouldn’t notice if I did.

And now I’m listening to every speaker I hear, to see whether they do it too. Do you?

Harry Potter and the Half-Baked Plan

Despite J.K. Rowling’s decision not to offer the new Harry Potter book in e-book format, it took less than a day for fans to scan the book and assemble an unauthorized electronic version, which is reportedly circulating on the Internet.

If Rowling thought that her decision against e-book release would prevent infringement, then she needs to learn more about Muggle technology. (It’s not certain that her e-book decision was driven by infringement worries. Kids’ books apparently sell much worse as e-books than comparable adult books do, so she might have thought there would be insufficient demand for the e-book. But really – insufficient demand for Harry Potter this week? Not likely.)

It’s a common mistake to think that digital distribution leads to infringement, so that one can prevent infringement by sticking with analog distribution. Hollywood made this argument in the broadcast flag proceeding, saying that the switch to digital broadcasting of television would make the infringement problem so much worse – and the FCC even bought it.

As Harry Potter teaches us, what enables online infringement is not digital release of the work, but digital redistribution by users. And a work can be redistributed digitally, regardless of whether it was originally released in digital or analog form. Analog books can be scanned digitally; analog audio can be recorded digitally; analog video can be camcorded digitally. The resulting digital copies can be redistributed.

(This phenomenon is sometimes called the “analog hole”, but that term is misleading because the copyability of analog information is not an exception to the normal rule but a continuation of it. Objects made of copper are subject to gravity, but we don’t call that fact the “copper hole”. We just call it gravity, and we know that all objects are subject to it. Similarly, analog information is subject to digital copying because all information is subject to digital copying.)

If anything, releasing a work in digital form will reduce online infringement, by giving people who want a digital copy a way to pay for it. Having analog and digital versions that offer different value propositions to customers also enables tricky pricing strategies that can capture more revenue. Copyright owners can lead the digital parade or sit on the sidelines and watch it go by; but one way or another, there is going to be a parade.

Who'll Stop the Spam-Bots?

The FTC has initiated Operation Spam Zombies, a program that asks ISPs to work harder to detect and isolate spam-bots on their customers’ computers. Randy Picker has a good discussion of this.

A bot is a malicious, long-lived software agent that sits on a computer and carries out commands at the behest of a remote badguy. (Bots are sometimes called zombies. This makes for more colorful headlines, but the cognoscenti prefer “bot”.) Bots are surprisingly common; perhaps 1% of computers on the Internet are infected by bots.

Like any successful parasite, a bot tries to limit its impact on its host. A bot that uses too many resources, or that too obviously destabilizes its host system, is more likely to be detected and eradicated by the user. So a clever bot tries to be unobtrusive.

One of the main uses of bots is for sending spam. Bot-initiated spam comes from ordinary users’ machines, with only a modest volume coming from each machine; so it is difficult to stop. Nowadays the majority of spam probably comes from bots.

Spam-bots exhibit the classic economic externality of Internet security. A bot on your machine doesn’t bother you much. It mostly harms other people, most of whom you don’t know; so you lack a sufficient incentive to find and remove bots on your system.

What the FTC hopes is that ISPs will be willing to do what users aren’t. The FTC is urging ISPs to monitor their networks for telltale spam-bot activity, and then to take action, up to and including quarantining infected machines (i.e., cutting off or reducing their network connectivity).
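As a rough illustration of what "telltale spam-bot activity" might look like to an ISP, the sketch below flags customers whose machines contact an unusually large number of distinct outside mail servers. The log format, the function name `flag_suspects`, and the threshold are all hypothetical; the FTC's program does not prescribe this heuristic, and a real detector would need to be considerably more careful about false positives.

```python
from collections import defaultdict


def flag_suspects(connections, max_destinations=50):
    """Return customer IDs that contacted more distinct mail servers than allowed.

    connections: iterable of (customer_id, destination_mail_server) pairs,
    assumed to come from the ISP's own outbound-mail connection logs.
    max_destinations: illustrative threshold, not a tuned value.
    """
    seen = defaultdict(set)
    for customer, destination in connections:
        seen[customer].add(destination)
    return sorted(c for c, dests in seen.items() if len(dests) > max_destinations)
```

An ordinary user's machine typically talks to one or two mail servers, so a count in the dozens or hundreds is at least worth a closer look before any quarantine decision.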

It would be good if ISPs did more about the spam-bot problem. But unfortunately, the same externality applies to ISPs as to users. If an ISP’s customer hosts a spam-bot, most of the spam sent by the bot goes to other ISPs, so the harm from that spam-bot falls mostly on others. ISPs will have an insufficient incentive to fight bots, just as users do.

A really clever spam-bot could make this externality worse, by making sure not to direct any spam to the local ISP. That would reduce the local ISP’s incentive to stop the bot to almost zero. Indeed, it would give the ISP a disincentive to remove the bot, since removing the bot would lower costs for the ISP’s competitors, leading to tougher price competition and lower profits for the ISP.

That said, there is some hope for ISP-based steps against bot-spam. There aren’t too many big ISPs, so they may be able to agree to take steps against bot-spam. And voluntary steps may help to stave off unpleasant government regulation, which is also in the interest of the big ISPs.

There are interesting technical issues here too. If ISPs start monitoring aggressively for bots, the bots will get stealthier, kicking off an interesting arms race. But that’s a topic for another day.

What is Spyware?

Recently the Anti-Spyware Coalition released a document defining spyware and related terms. This is an impressive-sounding group, convened by CDT and including companies like HP, Microsoft, and Yahoo.

Here is their central definition:

Spyware and Other Potentially Unwanted Technologies

Technologies implemented in ways that impair users’ control over:

  • Material changes that affect their user experience, privacy, or system security
  • Use of their system resources, including what programs are installed on their computers
  • Collection, use and distribution of their personal or otherwise sensitive information

These are items that users will want to be informed about, and which the user, with appropriate authority from the owner of the system, should be able to easily remove or disable.

What’s interesting about this definition is that it’s not exactly a definition – it’s a description of things that users won’t like, along with assertions about what users will want, and what users should be able to do. How is it that this impressive group could only manage an indirect, somewhat vague definition for spyware?

The answer is that spyware is a surprisingly slippery concept.

Consider a program that lurks on your computer, watching which websites you browse and showing you ads based on your browsing history. Such a program might be spyware. But if you gave your informed consent to the program’s installation and operation, then public policy shouldn’t interfere. (Note: informed consent means that the consequences of accepting the program are conveyed to you fully and accurately.) So behaviors like monitoring and ad targeting aren’t enough, by themselves, to make a program spyware.

Now consider the same program, which comes bundled with a useful program that you want for some other purpose. The two programs are offered only together, you have to agree to take them both in order to get either one, and there is no way to uninstall one without uninstalling the other too. You give your informed consent to the bundle. (Bundling can raise antitrust problems under certain conditions, but I’ll ignore that issue here.) The company offering you the useful program is selling it for a price that is paid not in dollars but in allowing the adware to run. That in itself is no reason for public policy to object.

What makes spyware objectionable is not the technology, but the fact that it is installed without informed consent. Spyware is not a particular technology. Instead, it is any technology that is delivered via particular business practices. Understanding this is the key to regulating spyware.

Sometimes the software is installed with no consent at all. Installing and running software on a user’s computer, without seeking consent or even telling the user, must be illegal under existing laws such as the Computer Fraud and Abuse Act. There is no need to change the law to deal with this kind of spyware.

Sometimes “consent” is obtained, but only by deceiving the user. What the user gets is not what he thinks he agreed to. For example, the user might be shown a false or strongly misleading description of what the software will do; or important facts, such as the impossibility of uninstalling a program, might be withheld from the user. Here the issue is deception. As I understand it, deceptive business practices are generally illegal. (If spyware practices are not illegal, we may need to expand the legal rules against business deception.) What we need from government is vigilant enforcement against companies that use deceptive business practices in the installation of their software.

That, I think, is about as far as the law should go in fighting spyware. We may get more anti-spyware laws anyway, as Congress tries to show that it is doing something about the problem. But when it comes to laws, more is not always better.

The good news is that we probably don’t need complicated new laws to fight spyware. The laws we have can do enough – or at least they can do as much as the law can hope to do.

(If you’re not running an antispyware tool on your computer, you should be. There are several good options. Spybot Search & Destroy is a good free spyware remover for Windows.)