November 24, 2006

Spam is Back

A quiet trend broke into the open today, when the New York Times ran a story by Brad Stone on the recent increase in email spam. The story claims that the volume of spam has doubled in recent months, which seems about right. Many spam filters have been overloaded, sending system administrators scrambling to buy more filtering capacity.

Six months ago, the conventional wisdom was that we had gotten the upper hand on spammers by using more advanced filters that relied on textual analysis, and by identifying and blocking the sources of spam. One smart venture capitalist I know declared spam to be a solved problem.

But now the spammers have adopted new tactics: sending spam from botnets (armies of compromised desktop computers), sending images rather than text, adding randomly varying noise to the messages to make them harder to analyze, and providing fewer URLs in messages. The effect of these changes is to neutralize the latest greatest antispam tools; and so the spammers are pulling back ahead, for now.

In the long view, not much has changed. The arms race will continue, with each side deploying new tricks in response to the other side’s moves, unless one side is forced out by economics, which looks unlikely.

To win, the good guys must make the cost of sending a spam message exceed the expected payoff from that message. A spammer’s per-message cost and payoff are both very small, and probably getting smaller. The per-message payoff is probably decreasing as spammers are forced to adopt new payoff strategies (e.g., switching from selling bogus “medical” products to penny-stock manipulation). But their cost to send a message is also dropping as they start to use other people’s computers (without paying) and those computers get more and more capable. Right now the cost is dropping faster, so spam is increasing.
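The economics boil down to the sign of one quantity: expected payoff minus cost, per message. Here is a minimal sketch; the numbers are purely hypothetical, chosen only to illustrate why botnet-level costs keep spam profitable even at tiny response rates.

```python
# Hypothetical numbers for illustration: spamming pays exactly when
# (expected payoff - cost) per message is positive.

def spam_profit_per_message(response_rate, payoff_per_response, cost_per_message):
    """Expected profit per spam message sent."""
    return response_rate * payoff_per_response - cost_per_message

# Even one response per 100,000 messages is profitable if botnets
# push the per-message sending cost low enough.
profit = spam_profit_per_message(
    response_rate=1e-5,        # responses per message (hypothetical)
    payoff_per_response=20.0,  # dollars per successful response (hypothetical)
    cost_per_message=1e-5,     # near-zero when sent from a botnet (hypothetical)
)
```

Raising either the per-message cost (better blocking) or lowering the payoff (fraud enforcement) flips the sign; the spammers’ recent moves push in the opposite direction.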

From the good guys’ perspective, the cost of spam filtering is increasing. Organizations are buying new spam-filtering services and deploying more computers to run them. The switch to image-based spam will force filters to use image analysis, which chews up a lot more computing power than the current textual analysis. And the increased volume of spam will make things even worse. Just as the good guys are trying to raise the spammers’ costs, the spammers’ tactics are raising the good guys’ costs.
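The cost asymmetry is easy to see in a toy example. A token-based filter scores a message by the words it contains; an image-based message carries its pitch in pixels, so the textual scorer has nothing to flag. This sketch is illustrative only (the word list and scoring rule are mine, not any real filter's):

```python
SPAM_TOKENS = {"viagra", "pharmacy", "stock", "guaranteed"}  # illustrative list

def text_spam_score(body: str) -> float:
    """Fraction of the message's tokens that appear on the suspicious-word list."""
    tokens = body.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t.strip(".,!") in SPAM_TOKENS)
    return hits / len(tokens)

text_msg = "Guaranteed pharmacy stock tips"
image_msg = ""  # the pitch is in an attached image; the body has no analyzable text

# The text message scores high; the image message scores zero.
# Catching it requires image analysis, which costs far more CPU per message.
```

That gap between a near-free textual lookup and per-message image analysis is exactly the cost the spammers are imposing.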

Spam is a growing problem in other communication media too. Blog comment spam is rampant – this blog gets about eight hundred spam comments a day. At the moment our technology is managing them nicely (thanks to Akismet), but that could change. If the blog spammers get as clever as the email spammers, we’ll be in big trouble.

For Once, BCS Controversy Not the Computers' Fault

It’s that time of year again. You know, the time when sports pundits bad-mouth the Bowl Championship Series (BCS) for picking the wrong teams to play in college football’s championship game. The system is supposed to pick the two best teams. This year it picked Ohio State, clearly the best team, and Florida, a controversial choice given that Michigan arguably had better results.

Something like this happens every year. What makes this year different is that for once it’s not being blamed on computers.

BCS uses a numerical formula combining rankings from various sources, including human polls and computerized rankings. In past years, the polls and computers differed slightly. The problem generally was that the computers missed the important nuances that human voters see. Computers didn’t know that games at the beginning of the year count much less, or that last year’s ranking is supposed to influence this year’s, or that games count more if they’re nationally televised, or that there’s a special bonus for Notre Dame or a retiring coach. And so the computers and humans sometimes disagreed.
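To make the setup concrete, here is a sketch of a composite ranking in the BCS spirit. The real BCS formula and weights differ; the 50/50 weighting and the rank inputs below are hypothetical, chosen to show how evenly matched computer ratings force the human polls to break the tie.

```python
# Hypothetical composite ranking: lower points = better ranking.
# The real BCS formula and weights are different; this is illustrative only.

def composite_rank_points(poll_ranks, computer_ranks, weight_polls=0.5):
    """Combine average human-poll rank and average computer rank."""
    poll_avg = sum(poll_ranks) / len(poll_ranks)
    comp_avg = sum(computer_ranks) / len(computer_ranks)
    return weight_polls * poll_avg + (1 - weight_polls) * comp_avg

# If the computers rate two teams exactly even, the human polls decide.
florida = composite_rank_points(poll_ranks=[2, 2], computer_ranks=[2, 3])
michigan = composite_rank_points(poll_ranks=[3, 3], computer_ranks=[3, 2])
```

With the computer averages identical (2.5 each), whichever team the human voters place higher wins the composite, which is precisely how the computers ducked responsibility this year.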

Human pundits sided unsurprisingly with the humans. The computer pundits all sided with the computers, but without an effective talk radio presence they were shouted down.

This year the computers cleverly ducked responsibility by rating Florida and Michigan exactly even, thereby forcing humans to take the heat for picking one or the other. The humans picked Florida. Problem was, the humans had previously rated Michigan above Florida but somehow flipped the two at the end, on the basis of not much new evidence (Florida performing as expected against a good opponent). The bottom line was simple: an Ohio State-Florida game would be cooler than an Ohio State-Michigan one – yet another factor the computers didn’t know about.

Since this year’s controversy is the humans’ fault, will the computers be given more weight next year? Don’t count on it.

NIST Recommends Not Certifying Paperless Voting Machines

In an important development in e-voting policy, NIST has issued a report recommending that the next-generation federal voting-machine standards be written to prevent (re-)certification of today’s paperless e-voting systems. (NIST is the National Institute of Standards and Technology, a government agency, previously called the National Bureau of Standards, that is a leading source of independent technology expertise in the U.S. government.) The report is a recommendation to another government body, the Technical Guidelines Development Committee (TGDC), which is drafting the 2007 federal voting-machine standards. The new report is notable for its direct tone and unequivocal recommendation against unverifiable paperless voting systems, and for being a recommendation of NIST itself and not just of the report’s individual authors.

[UPDATE (Dec. 2): NIST has now modified the document’s text, for example by removing the “NIST recommends…” language in some places and adding a preface saying it is only a discussion draft.]

The key concept in the report is software independence.

A voting system is software-independent if a previously undetected change or error in its software cannot cause an undetectable change or error in an election outcome. In other words, it can be positively determined whether the voting system’s (typically, electronic) CVRs [cast-vote records] are accurate as cast by the voter or in error.

This gets to the heart of the problem with paperless voting: we can’t be sure the software in the machines on election day will work as expected. It’s difficult to tell for sure which software is present, and even if we do know which software is there we cannot be sure it will behave correctly. Today’s paperless e-voting systems (known as DREs) are not software-independent.

NIST does not know how to write testable requirements to make DREs secure, and NIST’s recommendation to the STS [a subcommittee of the TGDC] is that the DRE in practical terms cannot be made secure. Consequently, NIST and the STS recommend that [the 2007 federal voting standard] should require voting systems to be [software independent].

In other words, NIST recommends that the 2007 standard should be written to exclude DREs.

Though the software-independence requirement and condemnation of DREs as unsecureable will rightly get most of the attention, the report makes three other good recommendations. First, attention should be paid to improving the usability and accessibility of voting systems that use paper. Second, the 2007 standard should include high-level discussion of new approaches to software independence, such as fancy cryptographic methods. Third, more research is needed to develop new kinds of voting technologies, with special attention paid to improving usability.

Years from now, when we look back on the recent DRE fad with what-were-we-thinking hindsight, we’ll see this NIST report as a turning point.

Duck Amuck and the Takedown Gun

I wrote last week (1, 2) about the CopyBot tool in Second Life, which can make an exact lookalike copy of any object, and the efforts of users to contain CopyBot’s social and economic effects. Attempts to stop CopyBot by technology will ultimately fail – in a virtual world, anything visible is copyable – so attention will turn, inevitably, to legal tactics.

One such tactic is the DMCA takedown notice. Second Life lets users keep the copyright in virtual objects they create, so the creator of a virtual object has a legal right to stop others from copying it (with standard exceptions such as fair use). The Digital Millennium Copyright Act (DMCA), among its other provisions, exempts service providers such as Second Life from liability for copyrighted stuff posted by users, provided that Second Life implements the DMCA’s notice and takedown procedure. Under this procedure, if you see an infringing copy of your material on Second Life, you can send a notice containing certain information to Second Life, and they have to respond by taking down the accused material. (For further details consult your neighborhood copyright lawyer.)
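The “certain information” the statute requires is enumerated in 17 U.S.C. § 512(c)(3): identification of the copyrighted work, identification and location of the infringing material, the complainant’s contact information, a good-faith-belief statement, and an accuracy statement under penalty of perjury, plus a signature. A rough sketch of assembling those elements (the function and field layout are mine, not an official form, and this is no substitute for that neighborhood copyright lawyer):

```python
def draft_takedown_notice(owner, work_description, infringing_location, contact):
    """Assemble the elements a DMCA 512(c)(3) takedown notice must contain."""
    return "\n".join([
        f"Copyright owner: {owner}",
        f"Work claimed to be infringed: {work_description}",
        f"Material to be removed and its location: {infringing_location}",
        f"Contact information: {contact}",
        "I have a good faith belief that the use described above is not "
        "authorized by the copyright owner, its agent, or the law.",
        "The information in this notice is accurate, and under penalty of "
        "perjury, I am authorized to act on behalf of the copyright owner.",
        f"Signature: {owner}",
    ])
```

Note that two of the elements are sworn statements about belief and authority, which is what makes automated, indiscriminate notice-sending legally risky, as we'll see below.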

Let’s apply this to a specific example. Alice designs a spiffy new hot air balloon that everyone covets. Bob uses CopyBot to make his own replica of the balloon, which he starts riding around the skies. Alice discovers this and sends a takedown notice to Second Life. Bob’s balloon is then “taken down” – it disappears from the world, as in the classic cartoon Duck Amuck, where the animator’s eraser plays havoc with Daffy Duck’s world.

But surely Bob isn’t the only one riding in a copied balloon. Others may have CopyBotted their own balloons or bought a balloon copy from Bob. It’s tedious for Alice to write and send a takedown notice every time she sees a copied balloon.

What Alice needs is a takedown gun. When she sees an infringing balloon, she just points the takedown gun at it and pulls the trigger. The takedown gun does the rest, gathering the necessary information and sending a takedown notice, dooming the targeted balloon to eventual destruction. It’s perfectly feasible to create a takedown gun, thanks to Second Life’s rich tools for object creation. It’s a gun that shoots law rather than bullets.

For extra style points, Alice can program the gun so that it refuses to shoot at balloons that she herself built. To do this, she programs the gun, before it fires, to issue a cryptographic challenge to the balloon. Authorized balloons will know a secret key that allows them to respond correctly to the challenge. But unauthorized copies of the balloon won’t know the key, because the key is built into the object’s scripted behavior, which CopyBot can’t duplicate. (Exercise for computer security students: how exactly would this protocol work?)
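One natural answer to the exercise is a nonce-plus-MAC protocol: the gun sends a fresh random challenge, and the balloon replies with a keyed hash of it. The key names and protocol shape below are my assumptions (and a real Second Life object would script this in LSL rather than Python):

```python
import hashlib
import hmac
import os

SECRET_KEY = b"alice-balloon-key"  # embedded in Alice's balloon script (hypothetical)

def challenge() -> bytes:
    """Gun side: generate a fresh random nonce so old replies can't be replayed."""
    return os.urandom(16)

def respond(nonce: bytes, key: bytes) -> bytes:
    """Balloon side: prove knowledge of the key without revealing it."""
    return hmac.new(key, nonce, hashlib.sha256).digest()

def is_authorized(nonce: bytes, reply: bytes) -> bool:
    """Gun side: an authorized balloon's reply matches the expected MAC."""
    expected = hmac.new(SECRET_KEY, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, reply)

nonce = challenge()
# Alice's own balloon knows the key and passes the check. A CopyBot
# replica copies appearance but not scripts, so it cannot compute the MAC.
authorized = is_authorized(nonce, respond(nonce, SECRET_KEY))
copied = is_authorized(nonce, respond(nonce, b"wrong-key"))
```

The random nonce matters: without it, a copied balloon could simply replay a response it overheard from the genuine article.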

But of course there is a small problem with abuse of takedown guns. To send a takedown notice, the law says you must be (or represent) the copyright owner and you must have a good faith belief that the targeted object is infringing. Alice might be careful to shoot the gun only at objects that appear to infringe her copyright; but others might not be so careful. Indiscriminate use of a takedown gun will get you in legal trouble for sending bogus takedown notices.

Initially, the management at Second Life pointed to takedown notices as a response to CopyBot-based infringement. More recently, they have shifted their position a bit, saying that infringement violates their Terms of Use and threatening to expel violators from Second Life. They still face the same problem, though. Presumably their enforcement actions will be driven by user complaints, which motivates Alice to make a complaint gun.

As the music industry has learned, when copying is easy, laws against copying are very hard to enforce.

DMCA Exemptions Granted

Last Wednesday afternoon the U.S. Copyright Office released its list of DMCA exemptions for the next three years. The timing is interesting: releasing news in the afternoon of the day before Thanksgiving is a near-optimal strategy if you want that news to escape notice and coverage in the U.S.

The purpose of these exemptions is to prevent harm to the public from overbreadth of the DMCA’s prohibition on circumventing technologies that control access to copyrighted works. Exemptions last three years.

The good news is that six exemptions were granted, the most ever:

  • Professors can make compilations of film and video material for research or teaching.
  • Archivists can preserve copies of old programs and computer games.
  • Anyone can work around broken hardware “dongles” that prevent access to software programs.
  • Blind people can use software to have e-books read aloud.
  • Wireless phone customers can switch their phones to a different wireless provider.
  • Anyone can study, test, or remove malware distributed on CDs.

(These are summaries; the exact scope of each exemption is detailed in the original document.)

I’m particularly happy about the last exemption, which was requested by Alex Halderman and me, with lots of help from Deirdre Mulligan and Aaron Perzanowski. The exemption is narrower than I would have liked – plenty of valuable research still raises legal issues – but it’s good to see official recognition that the DMCA has harmed research.

The not-so-good news is in some of the exemptions that were not granted. The exemption for censorware research was not renewed, mostly because its most effective advocates, such as Seth Finkelstein, got tired of re-requesting it. (Even if nothing has changed, each exemption must be re-requested every three years through the same bureaucratic process – one example of how the playing field is tilted against exemptions.)

Also, exemptions for space-shifting (e.g. downloading content into portable players like iPods) and backing up digital media were denied. As usual, the Copyright Office pretended not to know what everybody else seems to know, e.g. that digital media are fragile and need to be backed up.

On the other hand, they did seem to recognize the DMCA’s harm to public discourse. The exemptions for film scholarship, archiving, access by the blind, and malware research all address harms to public debate caused by the DMCA. Fair use is sometimes broken down into two categories: transformative uses such as scholarship, research and parody; and personal uses such as time-shifting and space-shifting. The Copyright Office now seems to recognize that the DMCA is harming transformative use.

But what they don’t yet see, apparently, is the harm to personal use – hence the denial of the space-shifting and backup requests. Worse yet, they didn’t even acknowledge that these personal uses are lawful in the first place. In short, the Copyright Office still isn’t willing to grapple with the issues of most direct interest to the public. Maybe they’ll catch on three years from now, or six. Or maybe the new Congress will act sooner and reform the DMCA.

(Derek Slater has a nice summary of some other commentary.)