December 14, 2024

attack of the context-sensitive blog spam?

I love spammers, really I do. Some of you may recall my earlier post here about freezing your credit report. In the past week, I’ve deleted two comments that were clearly spam and that made it through Freedom to Tinker’s Akismet filter. Both had generic, modestly complimentary language and a link to some kind of credit card application processing site. What’s interesting about this? One of two things.

  1. Akismet is letting those spams through because their content is “related” to the post.
  2. Or more ominously, the spammer in question is trolling the blogosphere for “relevant” threads and is then inserting “relevant” comment spam.

If it’s the former, then one can certainly imagine that Akismet and other such filters will eventually improve to the point where the problem goes away (i.e., even if it’s “relevant” to a thread here, if it’s posted widely then it must be spam). If it’s the latter, then we’re in trouble. How is an automated spam catcher going to detect “relevant” spam that’s (statistically) on-topic with the discussion where it’s posted and is never posted anywhere else?
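
To make the “posted widely” heuristic concrete, here is a minimal sketch in Python. It is not how Akismet works (I have no idea what Akismet does internally); the `fingerprint` function, the `WidelyPostedFilter` class, and the threshold of three sites are all made up for illustration. The idea is just to normalize each comment, hash it, and count how many distinct sites have seen the same fingerprint.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(comment: str) -> str:
    """Hash a comment after lowercasing and collapsing punctuation and whitespace."""
    normalized = re.sub(r"[^a-z0-9]+", " ", comment.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class WidelyPostedFilter:
    """Hypothetical filter: flag a comment once enough distinct sites have seen it."""

    def __init__(self, site_threshold: int = 3):
        # Assumed cutoff; a real service would have to tune this.
        self.site_threshold = site_threshold
        # Maps a comment fingerprint to the set of sites where it appeared.
        self.sites_seen = defaultdict(set)

    def is_spam(self, site: str, comment: str) -> bool:
        sites = self.sites_seen[fingerprint(comment)]
        sites.add(site)
        return len(sites) >= self.site_threshold
```

A comment that shows up on a third blog gets flagged, however on-topic it looks there. A comment that is only ever posted once never trips the counter, which is exactly why the second scenario is the worrying one.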