July 19, 2024

Archives for June 2004


Today I’ll be speaking on a panel at the USENIX Conference in Boston, on “The Politicization of [Computer] Security.” The panel is 10:30-noon, Eastern time. The other panelists are Jeff Grove (ACM), Gary McGraw (Cigital), and Avi Rubin (Johns Hopkins).

If you’re attending the panel, feel free to provide real-time narration/feedback/discussion in the comments section of this post. I’ll be reading the comments periodically during the panel, and I’ll encourage the other panelists to do so too.

Victims of Spam Filtering

Eric Rescorla wrote recently about three people who must have lots of trouble getting their email through spam filters: Jose Viagra, Julia Cialis, and Josh Ambien. I feel especially sorry for poor Jose, who through no fault of his own must get nothing but smirks whenever he says his name.

Anyway, this reminded me of an interesting problem with Bayesian spam filters: they’re trained by the bad guys.

[Background: A Bayesian spam filter uses human advice to learn how to recognize spam. A human classifies messages into spam and non-spam. The Bayesian filter assigns a score to each word, depending on how often that word appears in spam vs. non-spam messages. Newly arrived messages are then classified based on the scores of the words they contain. Words used mostly in spam, such as “Viagra”, get negative scores, so messages containing them tend to get classified as spam. Which is good, unless your name is Jose Viagra.]
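To make the background concrete, here is a minimal sketch of such a word-scoring filter. It is an illustration under stated assumptions, not the algorithm of any particular product: the function names (train, word_score, classify), the tiny training set, the add-one smoothing, and the use of a log-ratio as the "score" are all choices made for this example. The sign convention follows the text above: words seen mostly in spam get negative scores.

```python
import math
from collections import Counter

def train(messages):
    """Count, per class, how many human-labeled messages contain each word.

    messages: iterable of (text, is_spam) pairs. All names are hypothetical.
    """
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(set(text.lower().split()))
        totals[is_spam] += 1
    return counts, totals

def word_score(word, model):
    """Log-ratio of how often the word appears in non-spam vs. spam.

    Negative means "mostly seen in spam". Add-one smoothing (an assumption
    of this sketch) gives unseen words a neutral score near zero.
    """
    counts, totals = model
    p_ham = (counts[False][word] + 1) / (totals[False] + 2)
    p_spam = (counts[True][word] + 1) / (totals[True] + 2)
    return math.log(p_ham / p_spam)

def classify(text, model):
    """Sum the scores of the distinct words; a negative total means spam."""
    total = sum(word_score(w, model) for w in set(text.lower().split()))
    return "spam" if total < 0 else "ham"

# Made-up training data standing in for a user's hand-classified mail.
training = [
    ("cheap viagra online now", True),
    ("buy viagra cheap", True),
    ("viagra discount offer", True),
    ("lunch meeting tomorrow", False),
    ("notes from the meeting", False),
    ("see you at lunch", False),
]
model = train(training)

print(word_score("viagra", model))                 # negative: a spammy word
print(classify("lunch meeting notes", model))
print(classify("hello from jose viagra", model))   # poor Jose
```

With this toy data, a message signed by Jose Viagra is scored as spam even though nothing else in it looks suspicious, because the single strongly negative word outweighs the mildly positive ones.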

Many spammers have taken to lacing their messages with sections of “word salad” containing meaningless strings of innocuous-looking words, in the hopes that the word salad will trigger positive associations in the recipient’s Bayesian filter.

Now suppose a big spammer wanted to poison a particular word, so that messages containing that word would be (mis)classified as spam. The spammer could sprinkle the target word throughout the word salad in his outgoing spam messages. When users classified those messages as spam, the targeted word would develop a negative score in the users’ Bayesian spam filters. Later, messages with the targeted word would likely be mistaken for spam.

This attack could even be carried out against a particular targeted user. By feeding that user a steady diet of spam (or pseudo-spam) containing the target word, a malicious person could build up a highly negative score for that word in the targeted user’s filter.

Of course, this won’t work, or will be less effective, for words that have appeared frequently in a user’s legitimate messages in the past. But it might work for a word that is about to become more frequent, such as the name of a person in the news, or a political party. For example, somebody could have tried to poison “Fahrenheit” just before Michael Moore’s movie was released, or “Whitewater” in the early days of the Clinton administration.
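The poisoning attack, and its limits, can be illustrated with a small simulation. This is a sketch under assumptions, not a model of any real filter: the scoring rule (a smoothed log-ratio, negative meaning spammy), the made-up messages, and the choice of "fahrenheit" as the new word and "budget" as a word the user already sees in legitimate mail are all hypothetical.

```python
import math
from collections import Counter

def train(messages):
    """Per-class count of messages containing each word (hypothetical helper)."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(set(text.lower().split()))
        totals[is_spam] += 1
    return counts, totals

def word_score(word, model):
    """Smoothed log-ratio of non-spam to spam frequency; negative = spammy."""
    counts, totals = model
    p_ham = (counts[False][word] + 1) / (totals[False] + 2)
    p_spam = (counts[True][word] + 1) / (totals[True] + 2)
    return math.log(p_ham / p_spam)

# The user's history: legitimate mail that often mentions "budget",
# plus ordinary spam. Neither mentions "fahrenheit" yet.
ham = [("meeting agenda for the budget review", False)] * 20
spam = [("cheap pills online", True)] * 20

# The attacker's word salad carries both target words; the user
# dutifully marks every one of these messages as spam.
salad = [("cheap pills online fahrenheit budget", True)] * 20

before = train(ham + spam)
after = train(ham + spam + salad)

for word in ("fahrenheit", "budget"):
    print(word, round(word_score(word, before), 2), "->",
          round(word_score(word, after), 2))
```

In this toy run the new word goes from neutral to strongly negative, while the word with a long history in legitimate mail loses some of its positive score but stays above zero. That matches the argument above: the attack is most plausible against words the victim has not yet seen much.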

There is a general lesson here about the use of learning methods in security. Learning is attractive, because it can adapt to the bad guys’ behavior. But the fact that the bad guys are teaching the system how to behave can also be a serious drawback.

“Tech” Lobbyists Slow to Respond to Dangerous Bills

Dan Gillmor, among others, bemoans the lack of effective lobbying by technology companies. Exhibit A is their weak and disorganized response to various bills, such as the Hatch INDUCE/IICA Act, that would give the movie and music industries veto power over the development of new technology. It’s true that large tech companies have been slow and clumsy in addressing these issues; but that’s not the whole story.

The other part of the story is that the interests of a few large tech companies don’t necessarily coincide with those of the technology industry as a whole, or of the users of technology. Giving the entertainment industry a veto over new technologies would have two main effects: it would slow the pace of technical innovation, and it would create barriers to entry in the tech markets. Incumbent companies may be perfectly happy to see slower innovation and higher barriers to entry, especially if the entertainment-industry veto contained some kind of grandfather clause, either implicit or explicit, that allowed incumbent products to stay in the market. Such a clause seems likely should a veto be imposed.

Just to be clear, an entertainment-industry veto would surely hurt the tech incumbents. It’s just that it would hurt their upstart competitors more. So it’s not entirely surprising that the incumbents would have some mixed feelings about veto proposals, though it is disappointing that the incumbents aren’t standing up for the industry as a whole.

What can be done about this? I don’t see an easy answer. In Washington, it seems to be standard procedure to mistake the voices of a few incumbents for those of a whole industry. Certainly, the incumbents have no interest in contradicting that assumption. Our best hope is that the incumbents will see it in their own long-term interest to foster a fast-moving, highly competitive industry.

Minimum Age for Pro Basketball?

Yesterday was the NBA draft. In the first round, eight high school seniors were taken, and only five college seniors. (The rest were overseas players and college underclassmen.) The very first pick was a high school senior, chosen over a very accomplished college player.

You have to be 16 to drive. You have to be 21 to drink alcohol (at least where I live). Should there be a minimum age for playing professional basketball? NBA commissioner David Stern favors a minimum age of 20 for NBA players. The NFL’s rule, banning players less than three years out of high school, withstood a court challenge from Maurice Clarett, who wanted to go pro after two years of college.

Nobody can argue, after seeing Kobe Bryant, Kevin Garnett, and LeBron James, that college is a prerequisite for NBA stardom. Sure, some high-school draftees wash out, but they may well have failed just as badly had they spent four years playing college ball.

Stern, and other proponents of the minimum age rule, argue that going to college is good for these kids. That’s probably true, if they become real students. But it’s hard to see the point in making them pretend to be students, which is what many of them would do were it not for the straight-to-the-pros path. It’s especially hard to see the point of making them mark time as pseudo-students until they pass some arbitrary age threshold, at which point they can drop their pseudo-education like a red-hot brick and jump to the pros.

Another, considerably more cynical, argument for an age limit is that forcing kids to play college sports is a clever way to subsidize university education. If college basketball is just minor-league pro ball with unpaid players, then it can serve as a profit center for universities, generating revenue to support other students who are actually being educated.

But all of this ignores the biggest losers in the trend towards professionalization of college sports: the true student-athletes. These are the players who don’t spend all day in the weight room, who study things other than game films. It’s very hard for them to compete against full-time athletes, and so they face intense pressure to slack on their studies.

It seems to me that professional football and basketball could learn a thing or two from baseball. The normal path in baseball has been for players to turn pro immediately after high school, with only a few players

RIAA Blowing Smoke About INDUCE Act

Today’s New York Times runs a brief story by Matt Richtel and Tom Zeller, Jr. on the growing criticism of Sen. Hatch’s INDUCE Act (now given a less bizarre name, and a new acronym, IICA).

Sellers of clearly legitimate products, such as those in telecom and electronics industries, argue that the bill is too broad.

The RIAA shoots back with this:

But Mitch Bainwol, chief executive of the Recording Industry Association of America, a recording industry lobbying group, said the legislation was meant to be narrowly tailored to address companies that build technology focused on illegal file sharing.

The RIAA is just wrong here. There is nothing in the bill that limits it to companies. There is nothing that limits it to technology. There is nothing that limits it to file sharing. Any of those limits could have been written into the bill – but they weren’t. The language of the bill is deliberately broad, and it appears to be deliberately vague as well.

Advocates of the Act have said little if anything to justify its breadth. This will be a key issue in the debate over the bill, if any serious debate is allowed to occur.