May 3, 2024

Judge Strikes Down COPA

Last week a Federal judge struck down COPA, a law requiring adult websites to use age verification technology. The ruling by Senior Judge Lowell A. Reed Jr. held COPA unconstitutional because it is more restrictive of speech (but no more effective) than the alternative of allowing private parties to use filtering software.

This is the end of a long legal process that started with the passage of COPA in 1999. The ACLU, along with various authors and publishers, immediately filed suit challenging COPA, and Judge Reed struck down the law. The case was appealed up to the Supreme Court, which generally supported Judge Reed’s ruling but remanded the case back to him for further proceedings because enough time had passed that the technological facts might have changed. Judge Reed held another trial last fall, at which I testified. Now he has ruled, again, that COPA is unconstitutional.

The policy issue behind COPA is how to keep kids from seeing harmful-to-minors (HTM) material. Some speech is legally obscene, which means it is so icky that it does not qualify for First Amendment free speech protection. HTM material is not obscene – adults have a legally protected right to read it – but is icky enough that kids don’t have a right to see it. In other words, there is a First Amendment right to transmit HTM material to adults but not to kids.

Congress has tried more than once to pass laws keeping kids away from HTM material online. The first attempt, the Communications Decency Act (CDA), was struck down by the Supreme Court in 1997. When Congress responded by passing COPA in 1999, it used the Court’s CDA ruling as a roadmap in writing the new law, in the hope that doing so would make COPA consistent with free speech.

Unlike the previous CDA ruling, Judge Reed’s new COPA ruling doesn’t seem to give Congress a roadmap for creating a new statute that would pass constitutional muster. COPA required sites publishing HTM material to use age screening technology to try to keep kids out. The judge compared COPA’s approach to an alternative in which individual computer owners had the option of using content filtering software. He found that COPA’s approach was more restrictive of protected speech and less effective in keeping kids away from HTM material. That was enough to make COPA, as a content-based restriction on speech, unconstitutional.

Two things make the judge’s ruling relatively roadmap-free. First, it is based heavily on factual findings that Congress cannot change – things like the relative effectiveness of filtering and the amount of HTM material that originates overseas beyond the effective reach of U.S. law. (Filtering operates on all material, while COPA’s requirements could have been ignored by many overseas sites.) Second, the alternative it offers requires only voluntary private action, not legislation.

Congress has already passed laws requiring schools and libraries to use content filters, as a condition of getting Federal funding and with certain safeguards that are supposed to protect adult access. The courts have upheld such laws. It’s not clear what more Congress can do. Judge Reed’s filtering alternative is less restrictive because it is voluntary, so that computers that aren’t used by kids, or on which parents have other ways of protecting kids against HTM material, can get unfiltered access. An adult who wants to get HTM material will be able to get it.

Doubtless Congress will make noise about this issue in the upcoming election year. Protecting kids from the nasty Internet is too attractive politically to pass up. Expect hearings to be held and bills to be introduced; but the odds that we’ll get a new law that makes much difference seem pretty low.

Fact check: The New Yorker versus Wikipedia

In July—when The New Yorker ran a long and relatively positive piece about Wikipedia—I argued that the old-media method of laboriously checking each fact was superior to the wiki model, where assertions have to be judged based on their plausibility. I claimed that personal experience as a journalist gave me special insight into such matters, and concluded: “the expensive, arguably old fashioned approach of The New Yorker and other magazines still delivers a level of quality I haven’t found, and do not expect to find, in the world of community-created content.”

Apparently, I was wrong. It turns out that EssJay, one of the Wikipedia users described in The New Yorker article, is not the “tenured professor of religion at a private university” that he claimed he was, and that The New Yorker reported him to be. He’s actually a 24-year-old, sans doctorate, named Ryan Jordan.

Jimmy Wales, who is as close to being in charge of Wikipedia as anybody is, has had an intricate progression of thought on the matter, ably chronicled by Seth Finklestein. His ultimate reaction (or at any rate, his current public stance as of this writing) is on his personal page in Wikipedia

I only learned this morning that EssJay used his false credentials in content disputes… I understood this to be primarily the matter of a pseudonymous identity (something very mild and completely understandable given the personal dangers possible on the Internet) and not a matter of violation of people’s trust.

As Seth points out, this is an odd reaction since it seems simultaneously to forgive EssJay for lying to The New Yorker (“something very mild”) and to hold him much more strongly to account for lying to other Wikipedia users. One could argue that lying to The New Yorker—and by extension to its hundreds of thousands of subscribers—was in the aggregate much worse than lying to the Wikipedians. One could also argue that Mr. Jordan’s appeal to institutional authority, which was as successful as it was dishonest, raises profound questions about the Wikipedia model.

But I won’t make either of those arguments. Instead, I’ll return to the issue that has me putting my foot in my mouth: How can a reader decide what to trust? I predicted you could trust The New Yorker, and as it turns out, you couldn’t.

Philip Tetlock, a long-time student of the human penchant for making predictions, has found (in a book whose text I can’t link to, but which I encourage you to read) that people whose predictions are falsified typically react by making excuses. They typically claim that they are off the hook because the conditions based on which they predicted a certain result were actually not as they seemed at the time of the inaccurate prediction. This defense is available to me: The New Yorker fell short of its own standards, and took EssJay at his word without verifying his identity or even learning his name. He had, as all con men do, a plausible-sounding story, related in this case to a putative fear of professional retribution that in hindsight sits rather uneasily with his claim that he had tenure. If the magazine hadn’t broken its own rules, this wouldn’t have gotten into print.

But that response would be too facile, as Tetlock rightly observes of the general case. Granted that perfect fact checking makes for a trustworthy story; how do you know when the fact checking is perfect and when it is not? You don’t. More generally, predictions are only as good as someone’s ability to figure out whether or not the conditions are right to trigger the predicted outcome.

So what about this case: On the one hand, incidents like this are rare and tend to lead the fact checkers to redouble their meticulousness. On the other, the fact claims in a story that are hardest to check are often for the same reason the likeliest ones to be false. Should you trust the sometimes-imperfect fact checking that actually goes on?

My answer is yes. In the wake of this episode The New Yorker looks very bad (and Wikipedia only moderately so) because people regard an error in The New Yorker to be exceptional in a way the exact same error in Wikipedia is not. This expectations gap tells me that The New Yorker, warts and all, still gives people something they cannot find at Wikipedia: a greater, though conspicuously not total, degree of confidence in what they read.

Google Print, Damages and Incentives

There’s been lots of discussion online of this week’s lawsuit filed against Google by a group of authors, over the Google Print project. Google Print is scanning in books from four large libraries, indexing the books’ contents, and letting people do Google-style searches on the books’ contents. Search results show short snippets from the books, but won’t let users extract long portions. Google will withdraw any book from the program at the request of the copyright holder. As I understand it, scanning was already underway when the suit was filed.

The authors claim that scanning the books violates their copyright. Google claims the project is fair use. Everybody agrees that Google Print is a cool project that will benefit the public – but it might be illegal anyway.

Expert commentators disagree about the merits of the case. Jonathan Band thinks Google should win. William Patry thinks the authors should win. Who am I to argue with either of them? The bottom line is that nobody knows what will happen.

So Google was taking a risk by starting the project. The risk is larger than you might think, because if Google loses, it won’t just have to reimburse the authors for the economic harm they have suffered. Instead, Google will have to pay statutory damages of up to $30,000 for every book that has been scanned. That adds up quickly! (I don’t know how many books Google has scanned so far, but I assume it’s a nontrivial numer.)

You might wonder why copyright law imposes such a high penalty for an act – scanning one book – that causes relatively little harm. It’s a good question. If Google loses, it makes economic sense to make Google pay for the harm it has caused (and to impose an injunction against future scanning). This gives Google the right incentive, to weigh the expected cost of harm to the authors against the project’s overall value.

Imposing statutory damages makes technologists like Google too cautious. Even if a new technology creates great value while doing little harm, and the technologist has a strong (but not slam-dunk) fair use case, the risk of statutory damages may deter the technology’s release. That’s inefficient.

Some iffy technologies should be deterred, if they create relatively little value for the harm they do, or if the technologist has a weak fair use case. But statutory damages deter too many new technologies.

[Law and economics mavens may object that under some conditions it is efficient to impose higher damages. That’s true, but I don’t think those conditions apply here. I don’t have space to address this point further, but please feel free to discuss it in the comments.]

In light of the risk Google is facing, it’s surprising that Google went ahead with the project. Maybe Google will decide now that discretion is the better part of valor, and will settle the case, stopping Google Print in exchange for the withdrawal of the lawsuit.

The good news, in the long run at least, is that this case will remind policymakers of the value of a robust fair use privilege.

Secrecy in Science

There’s an interesting dispute between astronomers about who deserves credit for discovering a solar system object called 2003EL61. Its existence was first announced by Spanish astronomers, but another team in the U.S. believes that the Spaniards may have learned about the object due to an information leak from the U.S. team.

The U.S. team’s account appears on their web page and was in yesterday’s NY Times. The short version is that the U.S. team published an advance abstract about their paper, which called the object by a temporary name that encoded the date it had been discovered. They later realized that an obscure website contained a full activity log for the telescope they had used, which allowed anybody with a web browser to learn exactly where the telescope had been pointing on the date of the discovery. This in turn allowed the object’s orbit to be calculated, enabling anybody to point their telescope at the object and “discover” it. Just after the abstract was released, the Spanish team apparently visited the telescope log website; and a few days later the Spanish team announced that they had discovered the object.

If this account is true, it’s clearly a breach of scientific ethics by the Spaniards. The seriousness of the breach depends on other circumstances which we don’t know, such as the possibility that the Spaniards had already discovered the object independently and were merely checking whether the Americans’ object was the same one. (If so, their announcement should have said that the American team had discovered the object independently.)

[UPDATE (Sept. 15): The Spanish team has now released their version of the story. They say they discovered the object on their own. When the U.S. group’s abstract, containing a name for the object, appeared on the Net, the Spaniards did a Google search for the object name. The search showed a bunch of sky coordinates. They tried to figure out whether any of those coordinates corresponded to the object they had seen, but they were unable to tell one way or the other. So they went ahead with their own announcement as planned.

This is not inconsistent with the U.S. team’s story, so it seems most likely to me that both stories are true. If so, then I was too hasty in inferring a breach of ethics, for which I apologize. I should have realized that the Spanish team might have been unable to tell whether the objects were the same.]

When this happened, the American team hastily went public with another discovery, of an object called 2003UB313 which may be the tenth planet in our solar system. This raised the obvious question of why the team had withheld the announcement of this new object for as long as they did. The team’s website has an impassioned defense of the delay:

Good science is a careful and deliberate process. The time from discovery to announcement in a scientific paper can be a couple of years. For all of our past discoveries, we have described the objects in scientific papers before publicly announcing the objects’ existence, and we have made that announcement in under nine months…. Our intent in all cases is to go from discovery to announcement in under nine months. We think that is a pretty fast pace.

One could object to the above by noting that the existence of these objects is never in doubt, so why not just announce the existence immediately upon discovery and continue observing to learn more? This way other astronomers could also study the new object. There are two reasons we don’t do this. First, we have dedicated a substantial part of our careers to this survey precisely so that we can discover and have the first crack at studying the large objects in the outer solar system. The discovery itself contains little of scientific interest. Almost all of the science that we are interested in doing comes from studying the object in detail after discovery. Announcing the existence of the objects and letting other astronomers get the first detailed observations of these objects would ruin the entire scientific point of spending so much effort on our survey. Some have argued that doing things this way “harms science” by not letting others make observations of the objects that we find. It is difficult to understand how a nine month delay in studying an object that no one would even know existed otherwise is in any way harmful to science!

Many other types of astronomical surveys are done for precisely the same reasons. Astronomers survey the skies looking for ever higher redshift galaxies. When they find them they study them and write a scientific paper. When the paper comes out other astronomers learn of the distant galaxy and they too study it. Other astronomers cull large databases such as the 2MASS infrared survey to find rare objects like brown dwarves. When they find them they study them and write a scientific paper. When the paper comes out other astronomers learn of the brown dwarves and they study them in perhaps different ways. Still other astronomers look around nearby stars for the elusive signs of directly detectable extrasolar planets. When they find one they study it and write a scientific paper….. You get the point. This is the way that the entire field of astronomy – and probably all of science – works. It’s a very effective system; people who put in the tremendous effort to find these rare objects are rewarded with getting to be the first to study them scientifically. Astronomers who are unwilling or unable to put in the effort to search for the objects still get to study them after a small delay.

This describes an interesting dynamic that seems to occur in all scientific fields – I have seen it plenty of times in computer science – where researchers withhold results from their colleagues for a while, to ensure that they get a headstart on the followup research. That’s basically what happens when an astronomer delays announcing the discovery of an object, in order to do followup analyses of the object for publication.

The argument against this secrecy is pretty simple: announcing the first result would let more people do followup work, making the followup work both quicker and more complete on average. Scientific discovery would benefit.

The argument for this kind of secrecy is more subtle. The amount of credit one gets for a scientific result doesn’t always correlate with the difficulty of getting the result. If a result is difficult to get but doesn’t create much credit to the discoverer, then there is an insufficient incentive to look for that result. The incentive is boosted if the discoverer gets an advantage in doing followup work, for example by keeping the original result secret for a while. So secrecy may increase the incentive to do certain kinds of research.

Note that there isn’t much incentive to keep low-effort / high-credit research secret, because there are probably plenty of competing scientists who are racing to do such work and announce it first. The incentive to keep secrets is biggest for high-effort / low-credit research which enables low-effort / high-credit followup work. And this is exactly the case where incentives most need to be boosted.

Michael Madison compares the astronomers’ tradeoff between publication and secrecy to the tradeoff an inventor faces between keeping an invention secret, and filing for a patent. As a matter of law, discovered scientific facts are not patentable, and that’s a good thing.

As Madison notes, science does have its own sort of “intellectual property” system that tries to align incentives for the public good. There is a general incentive to publish results for the public good – scientific credit goes to those who publish. Secrecy is sometimes accepted in cases where secret-keeping is needed to boost incentives, but the system is designed to limit this secrecy to cases where it is really needed.

But this system isn’t perfect. As the astronomers note, the price of secrecy is that followup work by others is delayed. Sometimes the delay isn’t too serious – 2003UB313 will still be plodding along in its orbit and there will be plenty of time to study it later. But sometimes delay is a bigger deal, as when an astronomical object is short-lived and cannot be studied at all later. Another example, which arises more often in computer security, is when the discovery is about an ongoing risk to the public which can be mitigated more quickly if it is more widely known. Scientific ethics tend to require at least partial publication in cases like these.

What’s most notable about the scientific system is that it works pretty well, at least within the subject matter of science, and it does so without much involvement by laws or lawyers.

Harry Potter and the Half-Baked Plan

Despite J.K. Rowling’s decision not to offer the new Harry Potter book in e-book format, it took less than a day for fans to scan the book and assemble an unauthorized electronic version, which is reportedly circulating on the Internet.

If Rowling thought that her decision against e-book release would prevent infringement, then she needs to learn more about Muggle technology. (It’s not certain that her e-book decision was driven by infringement worries. Kids’ books apparently sell much worse as e-books than comparable adult books do, so she might have thought there would be insufficient demand for the e-book. But really – insufficient demand for Harry Potter this week? Not likely.)

It’s a common mistake to think that digital distribution leads to infringement, so that one can prevent infringement by sticking with analog distribution. Hollywood made this argument in the broadcast flag proceeding, saying that the switch to digital broadcasting of television would make the infringement problem so much worse – and the FCC even bought it.

As Harry Potter teaches us, what enables online infringement is not digital release of the work, but digital redistribution by users. And a work can be redistributed digitally, regardless of whether it was originally released in digital or analog form. Analog books can be scanned digitally; analog audio can be recorded digitally; analog video can be camcorded digitally. The resulting digital copies can be redistributed.

(This phenomenon is sometimes called the “analog hole”, but that term is misleading because the copyability of analog information is not an exception to the normal rule but a continuation of it. Objects made of copper are subject to gravity, but we don’t call that fact the “copper hole”. We just call it gravity, and we know that all objects are subject to it. Similarly, analog information is subject to digital copying because all information is subject to digital copying.)

If anything, releasing a work a work in digital form will reduce online infringement, by giving people who want a digital copy a way to pay for it. Having analog and digital versions that offer different value propositions to customers also enables tricky pricing strategies that can capture more revenue. Copyright owners can lead the digital parade or sit on the sidelines and watch it go by; but one way or another, there is going to be a parade.