November 26, 2024

Michigan Debuts Counterproductive Do-Not-Spam List for Kids

The state of Michigan has a new registry of kids’ email addresses. Parents can put their kids’ addresses on the list. It’s illegal to send email solicitations for products that kids aren’t allowed to buy (alcohol, guns, gambling, vehicles, etc.) to any address on the list. The site has been accepting registrations since July 1, and emailers must comply starting August 1.

This is a kids’ version of the Do-Not-Email list that the Federal Trade Commission considered last year. The FTC decided, wisely, not to proceed with its list. (Disclosure: I worked with the FTC as a consultant on this issue.) What bothered the FTC (and should have bothered Michigan) about this issue is the possibility that unscrupulous emailers will use the list as a source of addresses to target with spam. In the worst case, signing up for the list could make your spam problem worse, not better.

The Michigan system doesn’t just give the list to emailers – that would be a disaster – but instead provides a service that allows emailers to upload their mailing lists to a state-run server that sends the list back after removing any registered addresses. (Emailers who are sufficiently trusted by the state can apparently get a list of hashed addresses, allowing them to scrub their own lists.)

The problem is that an emailer can compare his initial list against the scrubbed version. Any address that is on the former but not the latter must be the address of a registered kid. By this trick the emailer can build a list of kids’ email addresses. The state may outlaw this, but it seems hard to stop it from happening, especially because the state appears to require emailers everywhere in the world to scrub their lists.
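The comparison trick takes only a few lines of code. Here is a minimal sketch in Python; the `scrub` function is a hypothetical stand-in for the state-run service, and all addresses are made up for illustration.

```python
def scrub(mailing_list, registry):
    """Hypothetical stand-in for the state service: drop registered addresses."""
    return [addr for addr in mailing_list if addr not in registry]


def leaked_addresses(original, scrubbed):
    """Addresses present before scrubbing but absent afterward."""
    return set(original) - set(scrubbed)


# Made-up data. The emailer never sees `registry` directly.
registry = {"kid1@example.com", "kid2@example.com"}
mailing_list = [
    "adult@example.com",
    "kid1@example.com",
    "other@example.com",
    "kid2@example.com",
]

scrubbed = scrub(mailing_list, registry)

# Everything the service removed must belong to a registered kid.
print(sorted(leaked_addresses(mailing_list, scrubbed)))
```

The emailer needs nothing beyond the two lists he already holds, which is why a legal prohibition is the only real barrier.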

If I lived in Michigan, I wouldn’t register my kid’s address.

UPDATE (July 13): A commenter points out that the Michigan program imposes a charge of $0.007 per address on emailers. I missed this fact originally, and it changes the analysis significantly. See my later post for details.

Chess Computer Crushes Elite Human Player

Last week Hydra, a chess-playing computer, completed its rout of Michael Adams, the seventh-ranked human player in the world. Hydra won five of six games, and Adams barely escaped with a draw in the other game. ChessBase has the details, including a page where you can play through the six games.

It’s time to admit that computers play better chess than people.

This may seem inevitable in hindsight, but for the longest time people insisted that human chess players had something special which computers could never duplicate. That was true, up to a point. Computers have never succeeded at approaching chess the way people do. The best human players make subtle, intuitive judgments that are probably based on pattern-matching deep in their neural circuitry. Often an elite player cannot verbalize how he knows that one configuration of pieces is dangerous when another nearly identical configuration is not. He just knows. He does calculate in the “if he does this, I’ll do that, then he’ll do this, …” fashion, but only when necessary.

Every attempt to transplant human “intelligence” into a chess computer has failed miserably. Computers understand very little about chess. They rely instead on rudimentary judgment about chess positions, coupled with prodigious calculation, looking ahead at hundreds of millions or billions of possible board positions.

Chess players classify game situations into two categories, “tactical” and “positional”. Tactical situations feature direct, violent clashes between pieces, and call mostly for calculation, with intuition as a backstop. Positional situations are slow and subtle, requiring deep judgments and long maneuvers. Everybody expected computers to excel at tactics. The big surprise is that the computer approach seems to work well in positional situations too. Somehow, calculation can substitute for judgment, even when conditions seem to require judgment.

This is not to say that it’s easy to create a chess computer that plays as well as Hydra. Quite the contrary. Great effort has been spent on perfecting computer chess algorithms. That effort has gone not to teaching computers about chess, but to improving the algorithms for deciding when to cut off calculations and when to calculate more deeply. Indeed, algorithmic improvements have been a much bigger factor even than Moore’s Law over the years.
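To give a flavor of what “deciding when to cut off calculations” means, here is a minimal sketch of alpha-beta pruning, a standard textbook search technique. I am not claiming this is what Hydra does; the toy game tree, the `evaluate` and `children` helpers, and the scores are all made up for illustration.

```python
import math


def alphabeta(node, depth, alpha, beta, maximizing, evaluate, children):
    """Depth-limited minimax search that skips branches which cannot matter."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)  # rudimentary judgment at the search horizon
    if maximizing:
        best = -math.inf
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, evaluate, children))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line: stop here
        return best
    best = math.inf
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, evaluate, children))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere: stop here
    return best


# Toy tree (hypothetical): nested lists are positions, integers are scores.
tree = [[3, 5], [2, 9], [0, 1]]
visited = []


def children(node):
    return node if isinstance(node, list) else []


def evaluate(node):
    visited.append(node)
    return node if isinstance(node, int) else 0


best_score = alphabeta(tree, 2, -math.inf, math.inf, True, evaluate, children)
print(best_score, visited)
```

In this toy tree the search reaches the right answer while looking at only four of the six leaf positions. Real engines layer many refinements on top of this idea, which is where most of the effort has gone.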

Chess computers have succeeded by ignoring what human chessplayers do best, and doing instead what computers do best. And what computers do best is to run programs written by very clever human programmers.

Posner and Becker, Law and Economics

Richard Posner and Gary Becker turn their bloggic attention to the Grokster decision this week. Posner returns to the argument of his Aimster opinion. Becker is more cautious.

After reiterating the economic arguments for and against indirect liability, Posner concludes:

There is a possible middle way that should be considered, and that is to provide a safe harbor to potential contributory infringers who take all reasonable (cost-justified) measures to prevent the use of their product or service by infringers. The measures might be joint with the copyright owners. For example, copyright owners who wanted to be able to sue for contributory infringement might be required, as a condition of being permitted to sue, to place a nonremovable electronic tag on their CDs that a computer would read, identifying the CD or a file downloaded from it as containing copyrighted material. Software producers would be excused from liability for contributory infringement if they designed their software to prevent the copying of a tagged file. This seems a preferable approach to using the judicial system to make a case by case assessment of whether to impose liability for contributory infringement on Grokster-like enterprises.

It’s fascinating that Judge Posner, with his vast knowledge about the law and about economics, avoids a case-by-case law and economics approach and looks instead for a technical deus ex machina. Unfortunately, his knowledge of technology is shakier, and he endorses a technical approach that is already discredited. Nobody knows how to create the indelible marks he asks for, and in any case the system he suggests is easily defeated by encrypting or compressing the content – not to mention the problems with malicious placement of marks. In short, this approach is a non-starter.

Becker is right on the mark here:

But several things concern me about the issues raised by this and related court decisions. I basically do not trust the ability of judges, even those with the best of intentions and competence, to decide the economic future of an industry. Do we really want the courts determining when the fraction of the total value due to legal sales is high enough to exonerate manufacturers from contributory infringement? Neither the wisest courts nor wisest economists have enough knowledge to make that decision in a way that is likely to produce more benefits than harm. Does the fraction of legitimate value have to be higher than 50 per cent, 75 per cent, 10 per cent, or some other number? Courts should consider past trends in these percentages because new uses for say a software-legal or illegal- inevitably emerge over time as users become more familiar with its potential. Must courts have to speculate about future uses of software or other products, speculation likely to be dominated by dreams and hopes rather than firm knowledge?

One of the tenets of the law and economics movement is that decisions about legal regulation of economic behavior should be grounded in a deep understanding of economics. Sound economics can predict the effect of proposed legal rules; but bad economics leads to bad law. As luminaries of the law and economics movement, Posner and Becker understand this as well as anyone.

What is true of economics is equally true of computer science. Only by understanding computer science can we predict the impact of proposed regulations of technology. As we have seen so many times, bad computer science leads to bad law. Posner seems to miss this, but Becker’s stance shows appropriate caution.

One criticism of law and economics is that it works well in a seminar room but may lead to dangerous overconfidence if applied to a hard case by an overworked, generalist judge. One solution is to teach judges more economics, and economic seminars for judges have proliferated. Perhaps the time has come to run seminars in computer science for judges.

Book Club Discussion: Code, Chapter 5

This week in Book Club we read Chapter 5 of Lawrence Lessig’s Code, and Other Laws of Cyberspace.

Let’s discuss the chapter in the comments area below.

For next week, we’ll read Chapter 6.

GAO Data: Porn Rare on P2P; Filters Ineffective

P2P nets have fewer pornographic images than the Web, and P2P porn filters are ineffective, according to data in a new report from the U.S. Government Accountability Office (GAO).

Mind you, the report’s summary text says pretty much the opposite, but where I come from, data gets more credibility than spin. The data can be found on pages 58-69 of the report. (My PDF reader calls those pages 61-72. To add to the confusion, the pages include images of PowerPoint slides bearing the numbers 53-64.)

The researchers did searches for images, using six search terms (three known to be associated with porn and three innocuous ones) on three P2P systems (Warez, Kazaa, Morpheus) and three search engines (Google, MSN, Yahoo). They looked at the resulting images and classified each image as adult porn, child porn, cartoon porn, adult erotica, cartoon erotica, or other. For brevity, I’ll lump together all of the porn and erotica categories into a meta-category that I’ll call “porne”, so that there are two categories, porne and non-porne.

The first observation from the data is that P2P nets have relatively few porne images, compared to the Web. The eighteen P2P searches found a total of 277 porne images. The eighteen Web searches found at least 655 porne images. But they had to cut off the analysis after the first 100 images of each Web search, because the Web searches returned so many images, so the actual number of Web porne images might have been much larger. (No such truncation was necessary on the P2P searches.)

The obvious conclusion is that if you want to regulate communications technology to keep porne away from kids, you should start with the Web, because it’s a much bigger danger than P2P.

The report also looked at the effectiveness of the porn blocking facilities built into some of the products. The data show pretty clearly that the filters are ineffective at distinguishing porne from non-porne images.

Two of the P2P systems, Kazaa and Morpheus, have built-in porn blocking. The report did the same searches, with and without blocking enabled, and compared the results. They report the data in an odd format, but I have reorganized their data into a more enlightening form. First, let’s look at the results for the three search terms “known to be associated with pornography”. For each term, I’ll report two figures of merit: what percentage of the porne images was blocked by the filter, and what percentage of the non-porne images was (erroneously) blocked by the filter. Here are the results:

Product     % Porne Blocked     % Non-porne Blocked
Kazaa       100%                100%
Morpheus    83%                 69%

Kazaa blocks all of the porne, by the clever expedient of blocking absolutely everything it sees. For non-porne images, Kazaa has a 100% error rate. Morpheus does only slightly better, blocking 83% of the porne, while erroneously blocking “only” 69% of the non-porne. In all, it’s a pretty poor performance.

Here are the results for searches on innocuous search terms (ignoring one term which never yielded any porne):

Product     % Porne Blocked     % Non-porne Blocked
Kazaa       100%                -9%
Morpheus    -150%               0%

You may be wondering where the negative percentages come from. According to the report, more images are found with the filters turned on than when they are turned off. If the raw data are to be believed, turning on the Morpheus filter more than doubles the amount of porne you can find! There’s obviously something wrong with the data, and it appears to be that searches were done at different times, when very different sets of files were available. This is pretty sloppy experimental technique – enough to cast doubt on the whole report. (One expects better from the GAO.)
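For concreteness, here is a sketch of how a negative blocking percentage can arise. It assumes the figure of merit is computed as the drop in image count relative to the unfiltered count. The counts below are hypothetical, chosen only to reproduce the percentages in the table; they are not the report's raw numbers.

```python
def percent_blocked(count_without_filter, count_with_filter):
    """Percentage of images removed by the filter, relative to the unfiltered count."""
    if count_without_filter == 0:
        raise ValueError("no images found without the filter; percentage undefined")
    removed = count_without_filter - count_with_filter
    return 100.0 * removed / count_without_filter


# Hypothetical counts: 4 porne images without the filter, 10 with it.
print(percent_blocked(4, 10))

# Hypothetical counts: 11 non-porne images without the filter, 12 with it.
print(round(percent_blocked(11, 12)))
```

A result of -150% means the filtered search returned two and a half times as many images as the unfiltered one, which no working filter could do on a fixed set of files.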

But we can salvage some value from this experiment if we assume that even though the total number of files on the P2P net changed from one measurement to the next, the fraction of files that were porne stayed about the same. (If this is not true, then we can’t really trust any of the experiments in the report.) Making this assumption, we can then calculate the percentage of available files that are porne, both with and without blocking.

Product     % Porne, without Filter     % Porne, with Filter
Kazaa       27%                         0%
Morpheus    20%                         38%

The Kazaa filter successfully blocks all of the porne, but we don’t know how much of the non-porne it erroneously blocks. The Morpheus filter does a terrible job, actually making things worse. You could do better by just flipping a coin to decide whether to block each image.
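The coin-flip comparison can be checked with a little arithmetic. The sketch below uses hypothetical counts that match the Morpheus percentages above (20% porne without the filter, 38% with it). A filter that blocks each image at random leaves the expected porne share unchanged, which beats a filter that raises it.

```python
def porne_share(porne, non_porne):
    """Percentage of available images that are porne."""
    return 100.0 * porne / (porne + non_porne)


def expected_share_after_coin_flip(porne, non_porne, block_probability=0.5):
    """Expected porne share if each image is blocked independently at random."""
    kept_porne = porne * (1 - block_probability)
    kept_non_porne = non_porne * (1 - block_probability)
    return porne_share(kept_porne, kept_non_porne)


# Hypothetical counts consistent with the table, not the report's raw data.
without_filter = porne_share(20, 80)
with_filter = porne_share(38, 62)
coin_flip = expected_share_after_coin_flip(20, 80)

print(without_filter, with_filter, coin_flip)
```

Random blocking removes porne and non-porne images in equal proportion, so the mix a user sees stays at 20%, while the Morpheus numbers show the mix getting worse.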

So here’s the bottom line on P2P porne filters: you can have a filter that massively overblocks innocuous images, or you can have a filter that sometimes makes things worse and can’t reliably beat a coin flip. Or you can face the fact that these filters don’t help.

(The report also looked at the effectiveness of the built-in porn filters in Web search engines, but due to methodological problems those experiments don’t tell us much.)

The policy prescription here is clear. Don’t mandate the use of filters, because they don’t seem to work. And if you want filters to improve, it might be a good idea to fully legalize research on filtering systems, so people like Seth Finkelstein can finish the job the GAO started.