March 19, 2024

Archives for April 2008

spammers gone wild

I’m sure this sort of behavior is old news, but it’s still really annoying.  Starting last night and continuing as I’m writing this, some annoying spammer has been forging my email address as the “From” line of a variety of spams.  This is causing a staggering volume of backscatter, mostly of the “Delivery Status Notification (failure)” variety.  Sampling these messages, I’m seeing several interesting things.

  1. The spammer is using my proper email address (dwallach@…) on each message, but a different “real” name on each one.  The name “Dan Wallach” does not appear anywhere.
  2. I forward everything to Gmail.  Gmail considers all of this backscatter to be spam.  That’s probably the correct answer, but I’m not sure I want to train my own DSPAM to do the same thing.  (DSPAM runs locally, and then I save a local copy and forward to Gmail.)  If I send a real message and it legitimately bounces, I want to know about it.  If I train DSPAM that all of these delivery status notifications are spam, it will inevitably throw away anything from “mailer-daemon”.  I’m unclear on whether that’s good or bad.
  3. You could easily build a bounce-message validator.  Every backscatter seems to have the original message ID in it, somewhere.  If the backscatter mentions a message ID that my system actually generated, then the backscatter is allowed.  Otherwise it’s dropped.  (This idea appears to be a variation of VERP; I’d make the message ID be a keyed MAC of a sequence number.)
  4. A large number of these spams have a message body consisting entirely of “Take a look at yourself :)”  and linking to “video.exe” on a variety of different web sites.  Gmail helpfully rewrites those links such that they can track that I clicked on it.  This would also seem to give them an opportunity to give me an anti-virus warning, but they don’t do any such thing.  (“video.exe” is one of the common names used by the Storm worm.)
  5. Many spams include links that redirect through Google’s PageAd server to yet another server.  I clicked on one of them.  It appears that the PageAd redirector worked, but then Firefox’s “badware” detector caught the destination as being bad, ultimately taking me to stopbadware.org.  Go Firefox!
  6. Some legit antispam firewall products (including Barracuda) are helpfully telling me my message “was blocked by our Spam Firewall. The email you sent with the following subject has NOT BEEN DELIVERED”.  This is clearly broken behavior.  Just drop it and move on!
  7. Several of the backscatter messages are actually validation messages (sender address verification).  This has been largely discredited due to a variety of practical problems, never mind common-case annoyance to normal users.
  8. One of the spammers seems to be quite keen to sell replicas of expensive wristwatches, and those links take you to some kind of seemingly real online store, albeit with a funky DNS name.  Somehow, even if I did want a fake expensive watch, I’m not sure I’d be comfortable typing my credit card number into a web site whose name is a list of random characters and who (clearly) is closely related to the underworld of lecherous spammers.
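The bounce-message validator from item 3 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a working mail filter: the secret key, the `example.org` domain, the `<sequence.tag@domain>` message ID layout, and the function names are all hypothetical, and HMAC-SHA256 truncated to 20 hex characters stands in for the "keyed MAC of a sequence number".

```python
import hashlib
import hmac
import re

# Hypothetical values for illustration; a real deployment would keep the key private.
SECRET_KEY = b"replace-with-a-real-secret"
DOMAIN = "example.org"
TAG_LENGTH = 20  # hex characters of the MAC kept in the message ID (an assumption)


def _mac(seq: str) -> str:
    """Keyed MAC of a sequence number, truncated to TAG_LENGTH hex characters."""
    digest = hmac.new(SECRET_KEY, seq.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:TAG_LENGTH]


def make_message_id(seq: int) -> str:
    """Build the Message-ID for an outgoing message with the given sequence number."""
    return f"<{seq}.{_mac(str(seq))}@{DOMAIN}>"


# Matches message IDs in the assumed layout anywhere in a bounce's text.
_ID_PATTERN = re.compile(
    r"<(\d+)\.([0-9a-f]{%d})@%s>" % (TAG_LENGTH, re.escape(DOMAIN))
)


def bounce_is_genuine(bounce_text: str) -> bool:
    """Accept a bounce only if it quotes a message ID whose MAC verifies."""
    for seq, tag in _ID_PATTERN.findall(bounce_text):
        if hmac.compare_digest(tag, _mac(seq)):
            return True
    return False
```

Because the check only needs the key, the system does not have to remember every message ID it ever generated. A forged "From" line will not carry a valid tag, so its backscatter fails the check and can be dropped.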

EDIT: fixed post that had gone out before it was done.

Bizarre Undervote on iVotronic in France

In France, most municipalities use paper ballots in elections, but a few places have begun using DRE (direct-recording electronic) machines. Pierre Muller, a French computer scientist, has recently sent me a report of a malfunction by an ES&S iVotronic machine in a recent municipal election.

In this spring’s elections (and he believes this also happened last year), there have been some unexplained “undervotes” on iVotronic machines. Below is a printout from an iVotronic machine. There’s a line “UnderVotes For Above Contest: 1”. Since the voter is required by the user-interface to choose between a candidate and the choice “vote blanc” [none of the above], undervotes should not be possible.

This event is similar in some ways to the Sequoia AVC Advantage bug observed in New Jersey on February 5, 2008. In both cases it appears that the machine is producing results that should not be possible, and in both cases local election officials are unable to explain how these results could legitimately be obtained.

Here is the relevant portion of the printout:

I’ve also prepared a larger image of the full printout, annotated with my English translation.

voting ID requirements and the Supreme Court

Last week, I posted here about voter ID requirements.  There was a case pending before the U.S. Supreme Court on the same topic.  It seems Indiana was trying to require voters to present ID in order to vote.  Lawsuit.  In the end, the court found that the requirement wasn’t particularly onerous (the New York Times’s article is as good as any for a basic summary, or go straight to the ruling).

Unsurprisingly, there has been a lot of hand-wringing on this (see, for example, this New York Times unsigned editorial).  We can expect similar legislation elsewhere now that the Court has made it pretty difficult to challenge these sorts of laws (see, for example, the ongoing battle to pass this sort of legislation in Texas).

As I wrote last time, I’m not particularly opposed to voters being required to present ID.  However, ID needs to be easy to get for anybody who is eligible to vote.  For most people, this is easy.  The big question we’d all like answered is the size of the population for which it’s not easy.  Consider, as a hypothetical example, an elderly Texas woman who never drove a car.  If she’s over 75 years old, the state’s centralized birth certificate registry won’t (officially) have her records.  It could well require detective work to produce sufficient documentation to get her a state ID card.  Who’s going to pay for that?

The big technical question, of course, is whether the root desires behind the voter ID requirement can be addressed in some more effective fashion than an ID requirement.  What are those root desires?

  1. Prevent legitimate citizens from registering to vote and voting in more than one locale
  2. Prevent registered voters from casting multiple votes in their own name
  3. Prevent registered voters from impersonating other registered voters
  4. Prevent anyone, including malicious poll workers, from casting votes on behalf of registered voters who have chosen not to vote
  5. Prevent non-eligible people (non-citizens, felons, etc.) from registering to vote
  6. Detect changes in registered voters’ eligibility status, quickly and accurately

Which problems can be solved by purple ink on a voter’s thumb?  #1 and #2 are readily solved, since a second attempt to vote will be forbidden.  #3 is disincentivized, because the impersonator will be unable to vote under his or her own name.  #4-6 will require other technologies.

Okay, which problems can be solved by having required voter ID?  Let’s assume, for the sake of discussion, we have a centralized state database keyed off the voter’s ID card number, but individual polling places do not have real-time access to this database.  Also, let’s assume that voter ID cards do not have any computational power: no smart cards, no crypto, etc.  #1 is ostensibly solved by the central database.  #2 cannot be prevented (at least, in a world with early voting or voting centers, where a voter has multiple places where he or she can legitimately vote), but it can be detected, and is thus disincentivized.  #3 is solved.  #4 is largely unsolved: if malicious poll workers want to forge signatures in the poll book, they may or may not be detected.  (In a recount situation, written signatures should be verified, but it’s unclear what the accuracy of that checking process might be.)
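To make the "detected, not prevented" point for #2 concrete, here is a minimal sketch of the after-the-fact check a central database could run once polling places upload their records. The record format (pairs of ID card number and polling place) and the function name are assumptions for illustration, not a description of any real election system.

```python
from collections import defaultdict


def find_duplicate_voters(checkins):
    """Return {id_card_number: [polling places]} for IDs seen more than once.

    `checkins` is an iterable of (id_card_number, polling_place) pairs, merged
    from every polling place after the fact (hypothetical format).
    """
    places_by_id = defaultdict(list)
    for id_number, place in checkins:
        places_by_id[id_number].append(place)
    # Keep only ID numbers that checked in at more than one place or time.
    return {
        id_number: places
        for id_number, places in places_by_id.items()
        if len(places) > 1
    }


records = [
    ("TX100", "Vote Center A"),
    ("TX200", "Vote Center A"),
    ("TX100", "Early Voting B"),
]
print(find_duplicate_voters(records))
```

Since polling places lack real-time access, this only flags the second vote after it has been cast. That is why the ID requirement disincentivizes #2 rather than preventing it.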

You could try to solve #4 with smartcards that issue digital signatures, but that’s a whole different can of worms.  Since the smartcard doesn’t really know what it’s being asked to sign, this could be exploited by an attacker.  (Example: you need to present your ID in a variety of different circumstances, such as proving your age to enter a bar.  The bouncer could “swipe” your card and use that as a way of getting a forged signature on an election record.)

What about #5 and #6?  These are really back-end database problems.  Requiring voters to present ID doesn’t have any impact.  However, having a database that is keyed off the voters’ ID cards significantly improves #5 and #6 and could ostensibly help reduce a variety of errors in the process.

Curiously, it seems that most of the benefit of requiring ID occurs in the back-end database, rather than on the day of the election.  The only real benefit of presenting ID, on election day, occurs in vote centers, early voting locations, and so forth.  When there may be millions of eligible voters who could use a vote center, traditional paper poll books are unworkable.  With a database keyed from ID card numbers, a voter’s records can be efficiently looked up and verified.  While this isn’t a security problem, improving the efficiency of the voting process is still a worthwhile goal.

Future of News Workshop, May 14-15 in Princeton

We’ve got a great lineup of speakers for our upcoming “Future of News” workshop. It’s May 14-15 in Princeton. It’s free, and if you register we’ll feed you lunch.

Agenda

Wednesday, May 14, 2008

9:30 – 10:45 Registration
10:45 – 11:00 Welcoming Remarks
11:00 – 12:00 Keynote talk by Paul Starr
12:00 – 1:30 Lunch, Convocation Room
1:30 – 3:00 Panel 1: The People Formerly Known as the Audience
3:00 – 3:30 Break
3:30 – 5:00 Panel 2: Economics of News
5:00 – 6:00 Reception

Thursday, May 15, 2008

8:15 – 9:30 Continental Breakfast
9:30 – 10:30 Featured talk by David Robinson
10:30 – 11:00 Break
11:00 – 12:30 Panel 3: Data Mining, Interactivity and Visualization
12:30 – 1:30 Lunch, Convocation Room
1:30 – 3:00 Panel 4: The Medium’s New Message
3:00 – 3:15 Closing Remarks

Panels

Panel 1: The People Formerly Known as the Audience:

How effectively can users collectively create and filter the stream of news information? How much of journalism can or will be “devolved” from professionals to networks of amateurs? What new challenges do these collective modes of news production create? Could informal flows of information in online social networks challenge the idea of “news” as we know it?

Panel 2: Economics of News:

How will technology-driven changes in advertising markets reshape the news media landscape? Can traditional, high-cost methods of newsgathering support themselves through other means? To what extent will action-guiding business intelligence and other “private journalism”, designed to create information asymmetries among news consumers, supplant or merge with globally accessible news?

  • Gordon Crovitz, former publisher, The Wall Street Journal
  • Mark Davis, Vice President for Strategy, San Diego Union Tribune
  • Eric Alterman, Distinguished Professor of English, Brooklyn College, City University of New York, and Professor of Journalism at the CUNY Graduate School of Journalism

Panel 3: Data Mining, Visualization, and Interactivity:

To what extent will new tools for visualizing and artfully presenting large data sets reduce the need for human intermediaries between facts and news consumers? How can news be presented via simulation and interactive tools? What new kinds of questions can professional journalists ask and answer using digital technologies?

Panel 4: The Medium’s New Message:

What are the effects of changing news consumption on political behavior? What does a public life populated by social media “producers” look like? How will people cope with the new information glut?

  • Clay Shirky, Adjunct Professor at NYU and author of Here Comes Everybody: The Power of Organizing Without Organizations.
  • Markus Prior, Assistant Professor of Politics and Public Affairs in the Woodrow Wilson School and the Department of Politics at Princeton University.
  • JD Lasica, writer and consultant, co-founder and editorial director of Ourmedia.com, president of the Social Media Group.

Panelists’ bios.

For more information, including (free) registration, see the main workshop page.

Voluntary Collective Licensing and Extortion

Reihan Salam has a new piece at Slate about voluntary collective licensing of music (which was also the topic of an online symposium organized by our center at Princeton). I’m generally a fan of Reihan’s work, but this time I think he got it wrong. His piece starts like this:

What would you do if a bully—let’s call him “Joey Giggles”—kept snatching your ice-cream cone? OK, now what if Joey Giggles then told you, “If you pay me five bucks a month, I’ll stop snatching your ice cream.” Depending on how much you hate getting beaten up, and how much you love ice-cream cones, you might decide that caving in is the way to go. This is what’s called a protection racket. It’s also potentially the new model for how we’ll buy and listen to music.

[…]

Now Big Music is mulling the Joey Giggles approach. Warner Music Group is trying to rally the rest of the industry behind a plan to charge Internet service providers $5 per customer per month, an amount that would be added to your Internet bill. In exchange, music lovers would get all the online tunes they want, meaning that anyone who spends more than $60 a year on music will come out way ahead. Download whatever you want and pay nothing! No more DRM! Swap files to your heart’s content—we promise, we won’t sue you (or snatch your ice-cream cone)!

This idea, that collective licenses amount to extortion – pay us or we’ll sue you – is often heard, but I don’t think it’s a valid criticism of collective licenses. The reason is pretty simple: if this is extortion, then all of copyright is extortion. The basic mechanism of copyright is that the creator of a work gets certain exclusive rights in the work. Exclusive rights means that there are certain things that nobody else can do with the work, without the creator’s permission. “Nobody else can do X” is another way of saying that if somebody else does X, the creator can sue them. When you buy a licensed copy of a work instead of downloading it illegally, what you’re buying is an enforceable promise that you won’t be sued (plus the knowledge that you’re playing by the rules, but that is intimately connected to the lawsuit protection). So the basic mechanism of copyright involves people paying a copyright owner for a promise not to sue them.

To put it another way, if you accept our current copyright system at all – even if you accept only a streamlined, improved version of it – then you’ve already accepted the kind of “extortion” that would be used to sell voluntary collective licenses. The only alternative is a complete redesign of the system, more complete even than a voluntary collective license.

Reihan does recommend a redesign. He endorses Terry Fisher’s suggestion of a government tax on broadband access, with the revenue used to pay musicians based on the popularity of their songs. This system has its benefits (though on balance I don’t think it’s good policy). But if you start out worried about strong-arm extraction of money from citizens, a mandatory tax scheme is an odd place to end up.

This is the fundamental problem of copyright policy in the digital age. It’s easy for people to get copyrighted works without paying. So either you forgo payment entirely, or you give somebody the mandate to collect payment. Who would you prefer: record companies or the government?