
Archives for September 2004

Privacy and Toll Transponders

Rebecca Bolin at LawMeme discusses novel applications for the toll transponder systems that are used to collect highway and bridge tolls.

These systems, such as the EZ-Pass system used in the northeastern U.S., operate by putting a tag device in each car. When a car passes through a tollbooth, a reader in the tollbooth sends a radio signal to the tag. The tag identifies itself (by radio), and the system collects the appropriate toll (by credit card charge) from the tag’s owner.
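To make the mechanics concrete, here’s a minimal sketch of that protocol in Python. The names and numbers are invented, not EZ-Pass’s actual design; the detail that matters is that the tag answers every reader with the same static ID.

```python
# Toy model of the naive protocol (invented identifiers, not EZ-Pass's
# real design). The tag answers every query with the same static ID.
TAG_ID = "EZP-1234567"            # hypothetical account identifier

def tag_answer(reader_signal: str) -> str:
    # The tag cannot tell a real tollbooth from a mimic, so it
    # identifies itself to anyone who asks.
    return TAG_ID

def tollbooth_charge(tag_id: str, toll: float, accounts: dict) -> None:
    # The toll authority maps the ID to an account and bills it.
    accounts[tag_id] -= toll

accounts = {TAG_ID: 50.00}
tollbooth_charge(tag_answer("query"), 2.50, accounts)
print(accounts[TAG_ID])           # 47.5
```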

This raises obvious privacy concerns: a third party could build base stations that mimic tollbooths and quietly collect information about who drives where.

Rebecca notes that Texas A&M engineers built a useful system that reads toll transponders at various points on Houston-area freeways, and uses the results to calculate the average traffic speed on each stretch of road. This is then made available to the public on a handy website.
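The speed calculation itself is straightforward. Here’s my reconstruction of the idea (not the Texas A&M system’s actual code): match each tag’s read times at two fixed readers and average the implied speeds.

```python
from statistics import mean

def avg_speed_mph(reads_a: dict, reads_b: dict, miles_between: float):
    # reads_a and reads_b map tag IDs to read times (in hours) at two
    # fixed readers; average the speeds of tags seen at both points.
    speeds = [miles_between / (reads_b[tag] - reads_a[tag])
              for tag in reads_a.keys() & reads_b.keys()
              if reads_b[tag] > reads_a[tag]]
    return mean(speeds) if speeds else None

# Two tags seen at readers 3 miles apart: 60 mph and 75 mph.
print(avg_speed_mph({"x": 0.00, "y": 0.01},
                    {"x": 0.05, "y": 0.05}, 3.0))   # 67.5
```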

The openness of the toll transponder system to third-party applications is both a blessing and a curse, since it allows good applications like the real-time traffic map, and bad applications like privacy-violating vehicle tracking.

Here’s where things get interesting. The tradeoff that Rebecca notes is not a necessary consequence of using toll transponders. It’s really the result of technical design decisions that could have been made differently. Want a toll transponder system that can’t be read usefully by third parties? We can design it that way. Want a system that allows only authorized third parties to be able to track vehicles? We can design it that way. Want a system that allows anyone to be able to tell that the same vehicle has passed two points, but without knowing which particular vehicle it was? We can design it that way, too.
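To sketch what two of those designs might look like (invented key names, and a rotation period chosen arbitrarily; a real design would also need challenge freshness, tamper resistance, and so on):

```python
import hashlib, hmac, os, time

# Design A: only the toll authority can identify the tag. The tag
# answers a reader's challenge with HMAC(shared_key, challenge); a
# rogue reader without the key sees responses it cannot link to any
# vehicle, while the authority, which knows every tag's key, can.
def private_tag_response(tag_key: bytes, challenge: bytes) -> bytes:
    return hmac.new(tag_key, challenge, hashlib.sha256).digest()

# Design B: anyone can tell that the same vehicle passed two points,
# but not which vehicle it was. The tag broadcasts a random pseudonym
# and rotates it periodically (one hour here, chosen arbitrarily).
class RotatingPseudonymTag:
    def __init__(self) -> None:
        self._epoch = None
        self._pseudonym = b""

    def response(self) -> bytes:
        epoch = int(time.time() // 3600)
        if epoch != self._epoch:
            self._epoch = epoch
            self._pseudonym = os.urandom(16)   # unlinkable across epochs
        return self._pseudonym
```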

Often, apparent tradeoffs in new technologies are not inherent, but could have been eliminated by thinking more carefully in advance about what the technology is supposed to do and what it isn’t supposed to do.

Even if it’s too late to change the deployed system, we can often learn by turning back the clock and asking how we would have designed the technology if we had known then what we know now about its implications. And on the first day of classes (which, here at Princeton, is today), this is also a useful source of homework problems.

When Wikipedia Converges

Many readers, responding to my recent quality-check on Wikipedia, have argued that over time the entries in question will improve, so that in the long run Wikipedia will outpace conventional encyclopedias like Britannica. It seems to me that this is the most important claim made by Wikipedia boosters.

If a Wikipedia entry gets enough attention, then it will likely change over time. When the entry is new, it will almost certainly improve by adding more detail. But once it matures, it seems likely that it will reach some level of quality and then level off, executing a quality-neutral random walk, with the changes reflecting nothing more than minor substitutions of one contributor’s style or viewpoint for another.
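Here’s a toy model of that trajectory (entirely my own construction, not data about Wikipedia): improvement fades as the entry nears its plateau, after which edit-to-edit churn dominates.

```python
import random

def simulate_entry(edits: int = 500, plateau: float = 100.0):
    # Each edit closes 5% of the remaining quality gap, plus noise
    # that is as likely to hurt as to help (style/viewpoint churn).
    quality, history = 0.0, []
    for _ in range(edits):
        quality += 0.05 * (plateau - quality)
        quality += random.gauss(0.0, 1.0)
        history.append(quality)
    return history

t = simulate_entry()
print(f"after  50 edits: {t[49]:.1f}")    # already near the plateau
print(f"after 500 edits: {t[-1]:.1f}")    # still hovering around it
```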

I’d expect a similar story for Wikipedia as a whole, with early effort spent mostly on expanding the scope of the site, and later effort spent more on improving (or at least changing) existing entries. Given enough effort spent on the site, more and more entries should approach maturity, and the rate of improvement in Wikipedia as a whole should approach zero.

This leaves us with two questions: (1) Will enough effort be spent on Wikipedia to cause it to reach the quality plateau? (2) How high is the quality plateau anyway?

We can shed light on both questions by studying the evolution of individual entries over time. Such a study is possible today, since Wikipedia tracks the history of every entry. I would like to see the results of such a study, but unfortunately I don’t have time to do it myself.
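For anyone inclined to try, here’s a sketch of how one might pull an entry’s revision history using today’s MediaWiki API (the API and its parameter names are the current ones, and postdate this post):

```python
import json, urllib.parse, urllib.request

def revision_history(title: str, limit: int = 50):
    # Query the live MediaWiki API for an entry's recent revisions.
    params = urllib.parse.urlencode({
        "action": "query", "prop": "revisions", "titles": title,
        "rvprop": "timestamp|user|size", "rvlimit": limit,
        "format": "json",
    })
    req = urllib.request.Request(
        "https://en.wikipedia.org/w/api.php?" + params,
        headers={"User-Agent": "wiki-quality-study/0.1"},  # courtesy UA
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    page = next(iter(data["query"]["pages"].values()))
    return page.get("revisions", [])      # newest first by default

# How has the entry's length changed over its last 50 revisions?
for rev in revision_history("Virtual memory"):
    print(rev["timestamp"], rev["size"])
```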

Wikipedia vs. Britannica Smackdown

On Friday I wrote about my spot-check of the accuracy of Wikipedia, in which I checked Wikipedia’s entries for six topics I knew well. I was generally impressed, except for one entry that went badly wrong.

Adam Shostack pointed out, correctly, that I had left the job half done, and I needed to compare to the entries for the same six topics in a traditional encyclopedia. So here’s my Wikipedia vs. Britannica Online comparison, for the six topics I wrote about on Friday.

Princeton University: Both entries are accurate and reasonably well written. Wikipedia has more information. Verdict: small advantage to Wikipedia.

Princeton Township: Britannica has a single entry for Princeton Township and Princeton Boro, while Wikipedia has separate entries. Both entries are good, but Wikipedia has more information (including demographics). Also, Britannica makes an error in saying that Morven is the Governor’s Residence for the state of New Jersey; Wikipedia correctly labels Morven as the former Governor’s Residence and Drumthwacket as the current one. Verdict: advantage to Wikipedia.

Me: Wikipedia has a short but decent entry; Britannica, unsurprisingly, has nothing. Verdict: advantage Wikipedia.

Virtual memory: Wikipedia has a pretty good entry; Britannica has no entry for virtual memory, and doesn’t appear to discuss the concept elsewhere, either. Verdict: advantage Wikipedia.

Public-key cryptography: Good, accurate entries in both. Verdict: toss-up.

Microsoft antitrust case: Britannica has only two sentences, saying that Judge Jackson ruled against Microsoft and ordered a breakup, and that the Court of Appeals overturned the breakup but agreed that Microsoft had broken the law. That’s correct, but it leaves out the settlement. Wikipedia’s entry is much longer but riddled with errors. Verdict: big advantage to Britannica.

Overall verdict: Wikipedia’s advantage is in having more, longer, and more current entries. If it weren’t for the Microsoft-case entry, Wikipedia would have been the winner hands down. Britannica’s advantage is in having lower variance in the quality of its entries.

Wikipedia Quality Check

There’s been an interesting debate lately about the quality of Wikipedia, the free online encyclopedia that anyone can edit.

Critics say that Wikipedia can’t be trusted because any fool can edit it, and because nobody is being paid to do quality control. Advocates say that Wikipedia allows domain experts to write entries, and that quality control is good because anybody who spots an error can correct it.

The whole flap was started by a minor newspaper column. The column, like much of the debate, ignores the best evidence in the Wikipedia-quality debate: the content of Wikipedia. Rather than debating, in the abstract, whether Wikipedia would be accurate, why don’t we look at Wikipedia and see?

I decided to take a look and see how accurate Wikipedia is. I looked at its entries on things I know very well: Princeton University, Princeton Township, myself, virtual memory (a standard but hard-to-explain bit of operating-system technology), public-key cryptography, and the Microsoft antitrust case.

The entries for Princeton University and Princeton Township were excellent.

The entry on me was accurate, but might be criticized for its choice of what to emphasize. When I first encountered the entry, my year of birth was listed as “[1964 ?]”. I replaced it with the correct year (1963). It felt a bit odd to be editing an encyclopedia entry on myself, but I managed to limit myself to a strictly factual correction.

The technical entries, on virtual memory and public-key cryptography, were certainly accurate, which is a real achievement. Both are backed by detailed technical information that probably would not be available at all in a conventional encyclopedia. My only criticism of these entries is that they could do more to make the concepts accessible to non-experts. But that’s a quibble; these entries are certainly up to the standard of typical encyclopedia writing about technical topics.

So far, so good. But now we come to the entry on the Microsoft case, which was riddled with errors. For starters, it got the formal name of the case (U.S. v. Microsoft) wrong. It badly mischaracterized my testimony, got the timeline of Judge Jackson’s rulings wrong, and made terminological errors such as referring to the DOJ as “the prosecution” rather than “the plaintiff”. I corrected two of these errors (the name of the case, and the description of my testimony), but fixing the whole thing was too big an effort.

Until I read the Microsoft-case page, I was ready to declare Wikipedia a clear success. Now I’m not so sure. Yes, that page will improve over time; but new pages will be added. If the present state of Wikipedia is any indication, most of them will be very good; but a few will lead high-school report writers astray.

Skylink, and the Reverse Sony Rule

This week the Federal Circuit ruled that Chamberlain, a maker of garage door openers, cannot use the DMCA to stop Skylink, a competitor, from making universal remote controls that can operate Chamberlain openers. This upholds a lower court decision.

This is an important step in the legal system’s attempt to figure out what the DMCA means, and there has been much commentary in the blogosphere. Here is my take.

The heart of the decision is the court’s effort to figure out what exactly Congress intended when it passed the DMCA. Chamberlain’s argument was that the plain language of the DMCA gave it the right to sue anybody who made an interoperable remote control. The lower court ruled against Chamberlain, essentially because the outcome urged by Chamberlain would be ridiculous. (It would imply, for instance, that Chamberlain customers did not have the right to open their own garage doors without Chamberlain’s permission.) But the lower court had trouble finding a DMCA-based legal argument to support its conclusion. The appeals court now presents such an argument.

The court’s problem is how to resolve the tension between the parts of the DMCA that seem to uphold the traditional rights of users, such as fair use and interoperation, and the parts that seem to erode those rights. Previous courts have tried to ignore that tension, but this court faces it and tries to find a balance. The acknowledgement of this tension, and the court’s description of the very real harms of construing the DMCA too broadly, provide DMCA opponents with their favorite parts of the opinion.

For most of the opinion, before veering away at the last minute, the court seems to be heading toward a kind of reverse Sony rule. The original Sony rule, laid down by the Supreme Court in 1984, says that making and selling dual-use tools – tools that have both significant infringing uses and significant non-infringing uses – does not constitute contributory copyright infringement. (Selling tools that have only non-infringing uses is obviously lawful, and selling tools that have only infringing uses is contributory infringement.)

A reverse Sony rule would say that dual-use tools are DMCA violations, if they are circumvention tools (according to the DMCA’s definition). In flirting with the reverse Sony rule, the court hints that Congress, in passing the DMCA, meant to revise the Sony rule because of a perceived danger that future circumvention tools would tip the copyright balance too far against copyright owners. In other words, such a rule would say that the purpose of the DMCA anti-circumvention provisions was to reverse the Sony rule, but only for circumvention tools; the original Sony rule would still hold for non-circumvention tools.

In the end, the court backs away from the simple reverse-Sony interpretation of the DMCA, and makes a more limited finding that (1) tools whose only significant uses are non-infringing cannot violate the DMCA, and (2) in construing the DMCA, courts should balance the desire of Congress to protect the flanks of copyright owners’ rights, against users’ rights such as fair use and interoperation. In this case, the court said, the balancing test was easy, because Chamberlain’s rights as a copyright owner (e.g., the right to prevent infringing copying of Chamberlain’s software) were not at all threatened by Skylink’s behavior, so one side of the balancing scale was just empty. The court’s decision to leave us with a balancing test, rather than a more specific rule, seems motivated by caution, which seems like a wise approach in dealing with uncharted legal territory.

Of course, this entire exercise is predicated on the assumption that Congress had a clear idea of what it meant to do in passing the DMCA. Based on what I have seen, that just isn’t the case. Many lawmakers have expressed surprise at some of the implications of the DMCA. Many seemed unaware that they were burdening research or altering the outlines of the Sony rule (and clearly some alteration of Sony took place). Many seemed to think, at the time, that the DMCA posed no threat to fair use. Partly this was due to misunderstanding of the technology, and partly to the tendency to write legislation by taking a weighted average of the positions of interest groups rather than by trying to construct a logically consistent statutory structure.

So the DMCA is still a mess. It still bans or burdens far too much legitimate activity. This court’s ruling has gone some distance toward resolving the inherent contradictions in the statute; but we still have a long, long way to go.