November 27, 2024

Wikipedia vs. Britannica Smackdown

On Friday I wrote about my spot-check of the accuracy of Wikipedia, in which I checked Wikipedia’s entries for six topics I knew well. I was generally impressed, except for one entry that went badly wrong.

Adam Shostack pointed out, correctly, that I had left the job half done: I needed to compare Wikipedia's entries against the entries for the same six topics in a traditional encyclopedia. So here’s my Wikipedia vs. Britannica Online comparison, for the six topics I wrote about on Friday.

Princeton University: Both entries are accurate and reasonably well written. Wikipedia has more information. Verdict: small advantage to Wikipedia.

Princeton Township: Britannica has a single entry for Princeton Township and Princeton Boro, while Wikipedia has separate entries. Both entries are good, but Wikipedia has more information (including demographics). Also, Britannica makes an error in saying that Morven is the Governor’s Residence for the state of New Jersey; Wikipedia correctly labels Morven as the former Governor’s Residence and Drumthwacket as the current one. Verdict: advantage to Wikipedia.

Me: Wikipedia has a short but decent entry; Britannica, unsurprisingly, has nothing. Verdict: advantage Wikipedia.

Virtual memory: Wikipedia has a pretty good entry; Britannica has no entry for virtual memory, and doesn’t appear to discuss the concept elsewhere, either. Verdict: advantage Wikipedia.

Public-key cryptography: Good, accurate entries in both. Verdict: toss-up.

Microsoft antitrust case: Britannica has only two sentences, saying that Judge Jackson ruled against Microsoft and ordered a breakup, and that the Court of Appeals overturned the breakup but agreed that Microsoft had broken the law. That’s correct, but it leaves out the settlement. Wikipedia’s entry is much longer but error-prone. Verdict: big advantage to Britannica.

Overall verdict: Wikipedia’s advantage is in having more, longer, and more current entries. If it weren’t for the Microsoft-case entry, Wikipedia would have been the winner hands down. Britannica’s advantage is in having lower variance in the quality of its entries.

Wikipedia Quality Check

There’s been an interesting debate lately about the quality of Wikipedia, the free online encyclopedia that anyone can edit.

Critics say that Wikipedia can’t be trusted because any fool can edit it, and because nobody is being paid to do quality control. Advocates say that Wikipedia allows domain experts to write entries, and that quality control is good because anybody who spots an error can correct it.

The whole flap was started by a minor newspaper column. The column, like much of the debate, ignores the best evidence in the Wikipedia-quality debate: the content of Wikipedia. Rather than debating, in the abstract, whether Wikipedia would be accurate, why don’t we look at Wikipedia and see?

I decided to take a look and see how accurate Wikipedia is. I looked at its entries on things I know very well: Princeton University, Princeton Township, myself, virtual memory (a standard but hard-to-explain bit of operating-system technology), public-key cryptography, and the Microsoft antitrust case.

The entries for Princeton University and Princeton Township were excellent.

The entry on me was accurate, but might be criticized for its choice of what to emphasize. When I first encountered the entry, my year of birth was listed as “[1964 ?]”. I replaced it with the correct year (1963). It felt a bit odd to be editing an encyclopedia entry on myself, but I managed to limit myself to a strictly factual correction.

The technical entries, on virtual memory and public-key cryptography, were certainly accurate, which is a real achievement. Both are backed by detailed technical information that probably would not be available at all in a conventional encyclopedia. My only criticism of these entries is that they could do more to make the concepts accessible to non-experts. But that’s a quibble; these entries are certainly up to the standard of typical encyclopedia writing about technical topics.

So far, so good. But now we come to the entry on the Microsoft case, which was riddled with errors. For starters, it got the formal name of the case (U.S. v. Microsoft) wrong. It badly mischaracterized my testimony, it got the timeline of Judge Jackson’s rulings wrong, and it made terminological errors such as referring to the DOJ as “the prosecution” rather than “the plaintiff”. I corrected two of these errors (the name of the case, and the description of my testimony), but fixing the whole thing was too big an effort.

Until I read the Microsoft-case page, I was ready to declare Wikipedia a clear success. Now I’m not so sure. Yes, that page will improve over time; but new pages will be added. If the present state of Wikipedia is any indication, most of them will be very good; but a few will lead high-school report writers astray.

Skylink, and the Reverse Sony Rule

This week the Federal Circuit court ruled that Chamberlain, a maker of garage door openers, cannot use the DMCA to stop Skylink, a competitor, from making universal remote controls that can operate Chamberlain openers. This upholds a lower court decision.

This is an important step in the legal system’s attempt to figure out what the DMCA means, and there has been much commentary in the blogosphere. Here is my take.

The heart of the decision is the court’s effort to figure out what exactly Congress intended when it passed the DMCA. Chamberlain’s argument was that the plain language of the DMCA gave it the right to sue anybody who made an interoperable remote control. The lower court ruled against Chamberlain, essentially because the outcome urged by Chamberlain would be ridiculous. (It would imply, for instance, that Chamberlain customers did not have the right to open their own garage doors without Chamberlain’s permission.) But the lower court had trouble finding a DMCA-based legal argument to support its conclusion. The appeals court now presents such an argument.

The court’s problem is how to resolve the tension between the parts of the DMCA that seem to uphold the traditional rights of users, such as fair use and interoperation, and the parts that seem to erode those rights. Previous courts have tried to ignore that tension, but this court faces it and tries to find a balance. The acknowledgement of this tension, and the court’s description of the very real harms of construing the DMCA too broadly, provide DMCA opponents with their favorite parts of the opinion.

For most of the opinion, before veering away at the last minute, the court seems to be heading toward a kind of reverse Sony rule. The original Sony rule, laid down by the Supreme Court in 1984, says that making and selling dual-use tools – tools that have both significant infringing uses and significant non-infringing uses – does not constitute contributory copyright infringement. (Selling tools that have only non-infringing uses is obviously lawful, and selling tools that have only infringing uses is contributory infringement.)

A reverse Sony rule would say that dual-use tools are DMCA violations, if they are circumvention tools (according to the DMCA’s definition). In flirting with the reverse Sony rule, the court hints that Congress, in passing the DMCA, meant to revise the Sony rule because of a perceived danger that future circumvention tools would tip the copyright balance too far against copyright owners. In other words, such a rule would say that the purpose of the DMCA anti-circumvention provisions was to reverse the Sony rule, but only for circumvention tools; the original Sony rule would still hold for non-circumvention tools.
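For readers who think in code, the two rules can be laid out as a small decision table. This is only a sketch of the summary above, not legal analysis: the `Tool` fields and the outcome strings are names made up for illustration, and the treatment of a circumvention tool with no infringing uses is an assumption, since the reverse Sony rule as described speaks only to dual-use tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    # Hypothetical fields, chosen only to mirror the prose above.
    infringing_uses: bool        # has significant infringing uses
    noninfringing_uses: bool     # has significant non-infringing uses
    circumvention: bool = False  # meets the DMCA's definition of a circumvention tool


def sony_rule(tool: Tool) -> str:
    """Outcome under the original Sony rule, as summarized in the text."""
    if tool.infringing_uses and not tool.noninfringing_uses:
        return "contributory infringement"
    # Covers purely non-infringing tools and dual-use tools alike.
    return "lawful"


def reverse_sony_rule(tool: Tool) -> str:
    """Outcome under the hypothetical reverse Sony rule."""
    if not tool.circumvention:
        # The original Sony rule still governs non-circumvention tools.
        return sony_rule(tool)
    if tool.infringing_uses:
        # Dual-use circumvention tools lose the Sony safe harbor.
        return "DMCA violation"
    # Assumption: a circumvention tool with no infringing uses stays lawful.
    return "lawful"


dual_use = Tool(infringing_uses=True, noninfringing_uses=True)
dual_use_circumvention = Tool(True, True, circumvention=True)

print(sony_rule(dual_use))                        # lawful
print(reverse_sony_rule(dual_use))                # lawful
print(reverse_sony_rule(dual_use_circumvention))  # DMCA violation
```

The only row that changes between the two functions is the dual-use circumvention tool, which is exactly the category the court was worried about.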

In the end, the court backs away from the simple reverse-Sony interpretation of the DMCA, and makes a more limited finding that (1) tools whose only significant uses are non-infringing cannot violate the DMCA, and (2) in construing the DMCA, courts should balance the desire of Congress to protect the flanks of copyright owners’ rights, against users’ rights such as fair use and interoperation. In this case, the court said, the balancing test was easy, because Chamberlain’s rights as a copyright owner (e.g., the right to prevent infringing copying of Chamberlain’s software) were not at all threatened by Skylink’s behavior, so one side of the balancing scale was just empty. The court’s decision to leave us with a balancing test, rather than a more specific rule, appears to be motivated by caution, which seems like a wise approach in dealing with uncharted legal territory.

Of course, this entire exercise is predicated on the assumption that Congress had a clear idea what it meant to do in passing the DMCA. Based on what I have seen, that just isn’t the case. Many lawmakers have expressed surprise at some of the implications of the DMCA. Many seemed unaware that they were burdening research or altering the outlines of the Sony rule (and clearly some alteration in Sony took place). Many seemed to think, at the time, that the DMCA posed no threat to fair use. Partly this was caused by misunderstanding of the technology, and partly it was due to the tendency to write legislation by taking a weighted average of the positions of interest groups, rather than trying to construct a logically consistent statutory structure.

So the DMCA is still a mess. It still bans or burdens far too much legitimate activity. This court’s ruling has gone some distance toward resolving the inherent contradictions in the statute; but we still have a long, long way to go.

Venezuela Voting Analysis

Avi Rubin, Adam Stubblefield, and I just released a paper analyzing the reported voting data from the recent Venezuelan election. The paper is available at http://www.venezuela-referendum.com, in both English and Spanish versions.

Here is the “Summary” section of (the English version of) the paper:

After the August 15 referendum in Venezuela on whether or not to recall president Ch

Valenti's Greatest Hits

Over at Engadget, JD Lasica interviews outgoing MPAA head Jack Valenti. In the interview, Valenti repeats several of his classic arguments.

For example, here’s Valenti, in this week’s interview, on fair use:

Now, fair use is not in the law.

We heard this before, in Derek Slater’s 2003 interview with Valenti:

What is fair use? Fair use is not a law. There’s nothing in law.

(Somebody should send him a copy of 17 U.S.C. § 107.)

Here’s Valenti, this week, on the subject of backups:

Where did this backup copy thing come from? A digital thing lasts forever.

Here he is in the 2003 interview:

[A DVD] lasts forever. It never wears out. In the digital world, we don’t need back-ups, because a digital copy never wears out. It is timeless.

(Backing up digital data is, of course, a necessary ritual of modern life. Who hasn’t lost digital data at some point?)

Interestingly, in the recent interview, unlike the 2003 one, Valenti shows a blind faith in DRM technology:

I really do believe we can stuff enough algorithms in a movie that only the dedicated hackers can spend the time and effort to try to plumb through those 1,000 algorithms to try to find a way to beat it. In time, we’ll be able to do this, because I have great faith in the technological genius that’s out there.

….

We’re trying to put in place technological magic that can combat the technological magic that allows thievery. I hope that within a year the finest brains in the IT community will come up with this stuff. A lot of people are working on it—IBM, Microsoft and maybe 10 other companies, plus the universities of Caltech and MIT, to try to find the kind of security clothing that we need to put around our movies.

It may be possible to so infect a movie with some kind of circuitry that allows people to copy to their heart’s content, but the copied result would come out with decayed fidelity with respect to sound and color. Another would be to have some kind of design in a movie that would say, ‘copy never,’ ‘copy once.’

Even ignoring the technical non sequiturs (“stuff … algorithms into a movie”; “infect a movie with … circuitry”), this is wildly implausible. Nothing has happened to make the technical prospects for DRM (anti-copying) technology any less bleak.

We can only hope Valenti’s successor stops believing in “technological magic” and instead teaches the industry to accept technical reality. File sharing cannot be wished away. The industry needs to figure out how to deal with it.