Ca(p)tching Cats and Dogs

I read on Jeff Atwood's blog that most strong Captchas have been defeated. Also, on top of visitors getting annoyed by it, the Captcha plugin I am using has gone unmaintained lately. And, one way or another, I am getting comment spam again. That is something I really hate, as you know what I would love to do to spammers.

I am seriously considering giving Asirra a try. It is an interesting project from Microsoft Research for an HIP (Human Interaction Proof) that uses info from petfinder.com to let users set apart pictures of dogs from those of cats. There is also a WordPress plugin, in the best and newest "we want to interoperate" fashion that we are finally getting at Microsoft (this has always been the way to go, IMHO, and BTW).

Anyway, what do you think ?

Google has pissed me off this week!

Now I pretty much liked GMail and Google in general. But this time they REALLY pissed me off! I will tell you that I am not a Google-hater even if I work for a competing company. Of course not everything that Google does is wonderful, but some of their services are really cool and useful and I have never refused to say they rocked when I felt they did.
In general, people seem to love them, and their stock value shows it (with the launch of "Code Search" this week they made a lot of people scream "how cool is this" so that they got back from just under 400 dollars to 417!). But that's not the issue. That is cool, that works. It's ok they make money if they make cool tools. It's fine for me.

In fact I consider GMail one of the best interfaces for reading mail that exist out there – I love "tagging" (oops: it's called "labelling" in their syntax), speed of search through messages (even though Outlook 2007 is faster on indexed content, but still you have to buy it and install it on your PC)… I also especially love the way it shows THREADING… so much that I moved pretty much EVERY mailing list I read onto my account with them:

Ma come se fa ?
(ok, they could do better with the localized version of "Re:" in replies…. in Italian a lot of broken MUA's translate that into "R:" and that isn't understood by GMail and will make it think it is another thread…. but that's a minor issue, and also one that every MUA handling threading has – including "mutt" – the real problem is the broken MUAs sending the "R:" in the first place. But I digress too much….).
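To make the "R:" problem concrete, here is a minimal Python sketch of subject normalization that treats both "Re:" and the localized "R:" as reply prefixes. The function name and the prefix list are my own assumptions for illustration; this is not how GMail or mutt actually group threads (real clients also look at headers such as In-Reply-To and References).

```python
import re

# Reply prefixes to strip. "Re" is the usual one; "R" is what some
# Italian-localized clients send. This list is an assumption, not a standard.
_REPLY_PREFIX = re.compile(r"^\s*(?:re|r)\s*:\s*", re.IGNORECASE)


def thread_key(subject: str) -> str:
    """Return a normalized subject usable as a naive threading key."""
    previous = None
    # Keep stripping so stacked prefixes such as "R: Re: R:" all go away.
    while previous != subject:
        previous = subject
        subject = _REPLY_PREFIX.sub("", subject)
    return subject.strip().lower()
```

With this, "Re: hello" and "R: hello" end up with the same key, while a subject like "Report: hello" is left alone because the prefix must be followed directly by a colon.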

I also keep GMail continuously open in a browser during the day because a lot of informative mail, and mail sent by friends, goes there. This is to say that I do get a lot of their ads (that is the point of having such an application, for them…). By contrast, Windows Live Mail reduced its ads to show only one… not to annoy you too much.
But the ads in GMail were not *really* a problem (I don't read them anyway, I just plain IGNORE THEM).

But this week they REALLY pissed me off. They REALLY have. And here is the reason:
I have been using a script for MONTHS to backup my database (the one powering THIS blog) and send it "off-site" to my GMail mailbox. Pretty much something like a lot of other people do, described in various articles and blog posts. Then I was labelling them with a rule, so that I could access my backups easily in case I needed them.
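For the curious, a backup mail of this kind can be sketched in a few lines of Python. This is not my actual script, just a minimal illustration: the function name, the subject prefix and the attachment filename are made up, and producing the SQL dump itself is left out.

```python
import gzip
from email.message import EmailMessage


def build_backup_message(sql_dump: bytes, sender: str, recipient: str, day: str) -> EmailMessage:
    """Wrap a database dump in a mail message, compressed as an attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # A fixed subject prefix (hypothetical) makes it easy to label with a rule.
    msg["Subject"] = f"[db-backup] {day}"
    msg.set_content(f"Database dump for {day} attached.")
    msg.add_attachment(
        gzip.compress(sql_dump),
        maintype="application",
        subtype="gzip",
        filename=f"blog-{day}.sql.gz",  # hypothetical naming scheme
    )
    return msg
```

Sending it is then a matter of handing the message to the local mail server, for example with `smtplib.SMTP("localhost").send_message(msg)`, run once a day from cron.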

Now I don't know if this violates their terms of use in any way… because I am not really using it as storage the way those programs did that circulated at one stage and had "reverse engineered" it. Those bypassed the web interface altogether, so people could use it as storage without ever seeing the ads. That was the issue, I think. In my case, I am just sending MAILS to myself. One per day. I also delete the old ones every now and then, and they are not even huge in size (attachments of 40 to 50KB so far!!)… anyway, I know a lot of people who store documents and all sorts of stuff even in their corporate mailboxes in Outlook (then maybe index them with Windows Desktop Search or Google Desktop to find them again)… I was only doing the same with GMail. I don't see the big issue here….. they might think otherwise…. but from what is happening I don't think that's the issue.

Anyway, now it's been three or four days that my backup mail gets rejected. My SMTP Server gets told:

host gmail-smtp-in.l.google.com[66.249.83.27] said:
550-5.7.1 Our system has detected an unusual amount of unsolicited
550-5.7.1 mail originating from your IP address. To protect our
550-5.7.1 users from spam, mail sent from your IP address has been
550-5.7.1 rejected. Please visit
550-5.7.1 http://www.google.com/mail/help/bulk_mail.html to review
550 5.7.1 our Bulk Email Senders Guidelines.

Now for fuck's sake. You know how much I hate SPAMMERS and what I would like to do with them. But I also know that it does happen to end up in RBLs and such sometimes. Fine. But GIVE ME a way to tell you that I am NOT one! If you go to the link above, all you find is a form where you can specify that mail that ended up in your "junk" folder actually wasn't spam. Yeah, right. In my case it does not even go into my "junk" folder! How am I supposed to give them the original header that reached THEM if I only have the one sent by my mailserver ? They just blacklisted my mail server's IP address! As their guidelines ask, I even have an SPF record, I always use the same address, etc….
So I tried to fill in the form, the day after I also tried to contact their abuse@google.com and abuse@gmail.com addresses.
Still nothing.
They even tell you (in the automated reply when you contact "abuse"):
"[…] For privacy and security reasons, we may not reveal the final outcome of an abuse case to the person who reported it. […]".
How great. How am I supposed to know if they even READ my complaint ?

You anti-spam people at GMail: "I am NOT a fucking spammer!!!!!". I haven't found a better way to tell ya this, you know, than writing it on my blog… this is just RIDICULOUS!

But to date my mails still get dropped. I'll probably have to send my backups somewhere else. At this point they pissed me off so much that I am also seriously considering getting back to use my own mailserver also for receiving and reading my mailing lists. Then I won't get ads there.
Afzetterij!
(I hope you have some Dutch guy on board at Google, as "Google Translate" does not translate from/to Dutch yet…. )

Edited on October 8th – While GMail REJECTS those mails (it SAYS it is not accepting them), Hotmail simply DROPS them (that is: it does not even SAY it is not accepting them):

to=, relay=mx4.hotmail.com[65.54.245.104], delay=3, status=sent (250 <20061008061010.GA19807@muscetta.com> Queued mail for delivery)

This way you THINK it is going to be delivered, but it NEVER shows up in your inbox. I don't know who's behaving the worst…
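The difference between the two behaviours shows up in the first three digits of the SMTP reply. Here is a minimal Python sketch (the function name is mine) that classifies a reply line like the ones quoted above. Note that "accepted" only means the remote server took the message, not that it will ever reach the inbox, which is exactly the Hotmail case.

```python
import re

# An SMTP reply line starts with a three-digit code, followed by a space
# (last line of the reply) or a hyphen (more lines follow).
_REPLY_CODE = re.compile(r"^(\d{3})[ -]")


def classify_reply(line: str) -> str:
    """Classify a single SMTP reply line by the class of its status code."""
    match = _REPLY_CODE.match(line.strip())
    if not match:
        return "unknown"
    code = int(match.group(1))
    if 200 <= code < 300:
        return "accepted"   # remote server took the message
    if 400 <= code < 500:
        return "deferred"   # temporary failure, sender should retry
    if 500 <= code < 600:
        return "rejected"   # permanent failure, sender gets a bounce
    return "unknown"
```

A log watcher built on this can at least flag the rejections; the silent drops are invisible from the sending side, because the log honestly says the mail was taken.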

How programs can teach each other

This article shows an interesting (interesting because it is simple but effective!) approach to training SpamAssassin's Bayesian spam filter by leveraging the training data in Thunderbird's Bayesian filter. Basically you can use a program to teach another program how to work better!
This paradigm is cool!
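I am not reproducing the article's exact method here. As a related but simpler variant of the same idea, the sketch below feeds folders that Thunderbird has already sorted into SpamAssassin's `sa-learn` tool, using its `--spam`, `--ham` and `--mbox` options. It assumes the folders are stored as mbox files; the helper names are mine and the paths you pass in depend on your own profile.

```python
import subprocess
from pathlib import Path
from typing import List


def sa_learn_command(mbox: Path, is_spam: bool) -> List[str]:
    """Build the sa-learn command line for one mbox folder."""
    label = "--spam" if is_spam else "--ham"
    return ["sa-learn", label, "--mbox", str(mbox)]


def teach_spamassassin(junk_mbox: Path, inbox_mbox: Path) -> None:
    """Train on what the other filter already classified (runs sa-learn)."""
    # Mail that ended up in the junk folder is learned as spam ...
    subprocess.run(sa_learn_command(junk_mbox, is_spam=True), check=True)
    # ... and mail left in the inbox is learned as ham (non-spam).
    subprocess.run(sa_learn_command(inbox_mbox, is_spam=False), check=True)
```

The obvious caveat is that any mistake made by the first filter gets taught to the second one, so it pays to clean up the folders before training.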

Trackback Spam

Oh I hate spammers, you know ? In fact I've also got this goal I would like to mark as "done"….
…but that's more for laughing than to be serious, really.

Coming to comment spam, I've been dealing quite a lot with the old 'b2' (WordPress's progenitor) at one stage, while I could not be asked to upgrade yet. At one stage I'd even coded my own unofficial fix for it to keep it going and maintain my sanity…

Then with WordPress I've enabled a CAPTCHA plugin which takes care of robots and only lets HUMANS place comments.

But now it's the turn of trackback spamming….
Sure, a lot of people have seen it AGES before me, simply because people DO read THEIR blog more than mine….
In a way, this might mean this is starting to be read – gosh! Who makes you read this ? Are you really THAT bored to get to read me?

Anyway, here's a couple of useful links proposing approaches to tackle comment and trackback spam. They might be useful to you too:
http://www.tamba2.org.uk/wordpress/spam/
http://photomatt.net/2005/01/05/trackback-spam/

Also now, I could get some of those plug-ins…. probably. For now I don't have time to test the plug-ins, so I've just hacked my own fix; we'll see if it works. Probably I will have to 'touch' it again, as I might have broken the trackback feature altogether. Well, it will pretty much test itself. Spammers, where are you now ? I'm watching my logs, please try….

[edited: 20th May 2006 – Ok they did send trackbacks tonight and my fix did work :-)]

Annoying Spammer – see if they like this…

Well, since I was quite busy cleaning and cleaning their stupid comments over and over again, over all of the old posts, I finally found some time the other day to put my hands in the code of this blog, and implement some checks on dates, so that the older posts are not "commentable" anymore – well, I should refine it, as the link to comment is there anyway… but it checks when you submit and it tells you to buzz off…. so I bet you can comment-spam me on this new post but at least I won't have to go through all of the old ones (which was a very tiring task…).
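The date check itself is tiny. The blog's real code is not shown here; this is a minimal Python sketch of the idea, with a made-up 30-day cutoff and hypothetical names.

```python
from datetime import datetime, timedelta

# Assumed cutoff for illustration; the real threshold is not stated above.
MAX_COMMENT_AGE = timedelta(days=30)


def comments_open(posted_at: datetime, now: datetime,
                  max_age: timedelta = MAX_COMMENT_AGE) -> bool:
    """Return True while a post is still young enough to accept comments."""
    return now - posted_at <= max_age
```

The important part is running this check when the comment is submitted, not only when the page is rendered: hiding the link does nothing against a robot that posts straight to the form handler.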

Annoying spammer and lame defacers – part three

I am sorry but, since they did it again, I have removed the possibility to post HTML tags into a comment – this way, if the reason of their idiotic comments was that of increasing their ranking in Google, this won't at least be accomplished.
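One simple way to get this effect, sketched in Python rather than in the blog's own code, is to escape the comment text before storing or displaying it, so that any markup shows up as literal characters instead of a working link. The function name is an assumption.

```python
import html


def sanitize_comment(text: str) -> str:
    """Escape HTML special characters so tags are shown, not rendered."""
    return html.escape(text)
```

For example, `sanitize_comment('<a href="http://x">buy</a>')` gives `&lt;a href=&quot;http://x&quot;&gt;buy&lt;/a&gt;`, which a browser prints as plain text, so no link reaches the search engines.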

I prefer to leave anonymous posting capabilities to my visitors, but I don't like helping spammers do their crap.

Annoying spammers and lame defacers

I just realized that some guy, believing himself to be funny, used my "comment" link under the blog posts to fill in a lot of crap – some sort of spam message promoting crap like the ones that fill our inboxes lately, with links to their site, in order to (I believe) raise their ranking in Google or something like that. They have been sitting there for some days – I did not closely monitor the server for a while – I just relocated, and have been busy with the new job and everything.
You really can't leave them alone for a minute!

I have read of people writing this sort of crap on wikis too, and I just don't get why people should be so lame as to use a public facility to write their crap. Possibly it is the same sort of people who write on walls…..

The logs were reporting the posts happened from two IP addresses:
38.119.107.88 and 213.91.217.78.
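Finding the offending addresses boils down to counting comment submissions per client IP in the web server log. Below is a minimal Python sketch, assuming the common access-log layout (IP first, then the quoted request line); the path of the comment handler is a parameter because it depends on the blog software.

```python
from collections import Counter
from typing import Iterable


def count_comment_posts(log_lines: Iterable[str], comment_path: str) -> Counter:
    """Count POST requests to the comment handler, grouped by client IP."""
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a complete access-log line
        # Layout: ip - - [date zone] "METHOD path protocol" status size
        ip = fields[0]
        method = fields[5].lstrip('"')
        path = fields[6]
        if method == "POST" and path.startswith(comment_path):
            hits[ip] += 1
    return hits
```

Calling `count_comment_posts(lines, path).most_common(5)` then lists the busiest posters first, which is usually all it takes to spot a robot.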

I now cleared them. If they continue I will have to deactivate the possibility for people to answer/comment to posts…. which would be a pity.