28 november 2004

comment spam: state of the arms race

My apologies for the extended absence. And in lieu of either policy rumblings or idle distractions, tonight's offering is about, well, blog maintenance—in particular, dealing with the scourge that is comment spam.

Movable Type 3.x offers two built-in comment management aids. (Three, if you count IP banning, which is all but useless when dealing with spoofers.) First is comment moderation, which Gerard Van der Leun has just adopted. The feature does keep unwanted pr0n ads from appearing on your public site, which is a plus. But for legitimate commenters this moderation is rather a buzzkill, especially if the blog author is a bit slow in checking for MT notifications. And on sites that have built up a large commenter community (like not this one) it's pretty much a conversation stopper.

That's where TypeKey comes in. In short, all commenters first must register with the service, and then sign in before posting comments. And although some big-name blogs have gone this route—Roger Simon's, for instance, and the Captain's Quarters—I really do not like it. Too centralized, for one; and the login process really is a bother. Had I not already had a TypeKey login by virtue of being a Six Apart client it's likely that I would never have registered.

The next most common approach is a Movable Type plugin called MT-Blacklist. As the name suggests, the URL submitted with any given comment is checked against a master list, and if there is a match the comment is rejected.
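For the curious, the general idea of a URL blacklist check can be sketched in a few lines of Python. This is only an illustration of the technique, not MT-Blacklist's actual code (the plugin itself is written in Perl and its matching rules are more elaborate). The list entries and the function names `is_blacklisted` and `accept_comment` are made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical entries; a real master list runs to thousands of lines.
BLACKLIST = {"cheap-pills.example", "casino-spam.example"}

def is_blacklisted(url, blacklist=BLACKLIST):
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # For "www.a.example" check "www.a.example", then "a.example", then "example".
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))

def accept_comment(url):
    """A comment is accepted only when its URL does not match the list."""
    return not is_blacklisted(url)
```

Note that a URL submitted without a scheme (no `http://`) has no parseable host here and would slip through; a real filter would have to normalize its input first.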

Many people seem quite happy with Blacklist. But I'll never employ it here. For one thing, on several occasions I've attempted to leave a comment along with this site's URL on MT-Blacklist-enabled blogs, only to have it rejected as “objectionable”. Which is a little odd: words from Tolkien's invented tongues don't generally trip censor alarms (even considering that the Telerin form of Celeborn is, uh, Teleporno). Then there's the fact that spammer domains multiply like busy rabbits, making an already very, very long master blacklist ever longer.

So what's left? Previous visitors to this blog might have noticed my favored remedy: a Turing security code in the comment entry form, courtesy of another MT plugin (this time by James Seng). If your blog's server runs Linux, or some other flavor of Unix, I heartily recommend it. Setup is straightforward if you are comfortable working in a Unix shell, and if your webhost is willing to provide SSH access. Feel free to contact me with any questions.
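In outline, a security-code check works something like the Python sketch below. It is a guess at the general mechanism, not a description of James Seng's plugin: the server picks a random code, shows it to the visitor as an image (that step is left out here), and compares what the visitor types against what it generated. The names `new_code` and `verify` and the six-digit length are assumptions for illustration.

```python
import hmac
import secrets
import string

ALPHABET = string.digits  # assumed: a purely numeric code

def new_code(length=6):
    """Generate a random code to render as an image in the comment form."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def verify(expected, submitted):
    """Compare the typed code with the stored one, ignoring stray whitespace."""
    return hmac.compare_digest(expected.encode(), submitted.strip().encode())
```

The server has to remember `expected` between showing the form and receiving the comment, for example in a session or a short-lived record keyed to the form.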

Finally, a commenter at Ghost of a Flea points to this simple hack, which I also implemented here over the weekend.

The key is that spammers have automated scripts that look for Movable Type blog sites and then post to our comments using a direct call to the “mt-comments.cgi” script. If you installed Movable Type into the default directory (/mt) then they know exactly where the script is and how to call it.

The solution is simple: rename the script to some odd name (e.g. qwerty.cgi) and edit your mt.cfg to point to the renamed CGI script. Look for the line that is commented out and reads “# CommentScript mt-comments.cgi”. Uncomment the line and change the name of the script to the new name. You need to rebuild the site before it takes effect. Users will not be able to post comments while you are doing this, but the entire process only takes a few minutes.

The beauty of this last approach is that no specialized skills are needed. If you're working on a Windows machine, simply download a copy of the mt.cfg file and edit it in Wordpad. Just be sure to save a copy of the unaltered file somewhere safe, at least until you know your site is functioning properly.
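If you would rather script the mt.cfg edit than do it by hand, something like this Python sketch would handle it. It assumes the file contains the commented-out line quoted above; the function name `set_comment_script` is hypothetical. It works on the file's text only, so reading the file, saving a backup, writing the result back, and renaming the .cgi file itself are still up to you.

```python
import re

# Matches the CommentScript line whether or not it is commented out.
_DIRECTIVE = re.compile(
    r"^[ \t]*#?[ \t]*CommentScript[ \t]+\S+[ \t]*$", re.MULTILINE
)

def set_comment_script(cfg_text, new_name):
    """Return cfg_text with CommentScript uncommented and set to new_name."""
    line = "CommentScript " + new_name
    # A function replacement avoids backslash surprises in new_name.
    new_text, count = _DIRECTIVE.subn(lambda m: line, cfg_text, count=1)
    if count == 0:
        # No such line found: append the directive at the end instead.
        new_text = cfg_text.rstrip("\n") + "\n" + line + "\n"
    return new_text
```

Called as `set_comment_script(text, "qwerty.cgi")`, it leaves every other line of the file untouched.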

Keep in mind, though, that the simplest solutions are likely also those most vulnerable to spammer workarounds. Hence the combination of Turing security code and comments script relabeling is perhaps the best way to go—at least until the day some enterprising script kiddie invents an image-reading spambot.

An arms race indeed.

Then, of course, there are those spammers willing to leave comments the old-fashioned way: by manual entry. I do get some of those here. And for the lost soul who persists in leaving ads for payday loans and propecia from the UK: you are one sorry wanker.

Go away, or I shall taunt you a second time…

UPDATE 120104: Winds of Change has gotten hit by 20,000 spam comments in two weeks.

UPDATE 010505: Changing the name of mt-comments.cgi turns out to be a weak fix indeed. I used it when upgrading Ghost of a Flea a week or so back; the spambots adapted to the new name within hours.



comments

Well, hope this Turing plugin works. We're looking at this as a possible additional solution now, and thanks for the tip.

As the article you link notes, MT-Blacklist 2.x has real drawbacks, one of which may be high loads on MySQL servers when under sustained comment spammer floods. That's a real problem because web hosts with shared servers don't like that at all. We're not sure if Total Choice Hosting is simply using underpowered machines, or if there's a real resource issue, but it has been a big problem for us and may force us to move the blog and buy a much more expensive hosting plan.

So the issue goes beyond just porno, gambling and drug filth on the blog... it can become a real money/survival issue.

Hmm... porn, gambling, vice being promoted internationally - why can't we sic al-Qaeda on these guys, and let them behead some deserving people for a change? Just askin'...

Joe Katzman | 2 december 2004, 03:45 am

Either that, or have Dante's heir add a lower circle to Hell.

Anyways, hope that whatever solution you settle on scales well—WoC gets traffic that I can only dream of.

Anthony | 3 december 2004, 07:17 am
 
