WordPress’s lastBuildDate in RSS Feed – gotchas

It has been ages since I wrote some technical stuff over here.

I recently stumbled into what is apparently a long-standing issue in WordPress, to the point that a lot of people decide to just implement their own, separate, RSS feed (there are a lot of articles out there on how to do that, but I didn’t like that solution).
The issue is that WordPress doesn’t advance the ‘lastBuildDate’ field in the feed often enough, and we also didn’t like the logic it uses to calculate it. By default, WordPress pulls the last ‘modified’ date rather than the ‘published’ date of a post, and it also includes dates from comments… not what we needed, since I am working for a news website where publication is almost always scheduled (which means the ‘modified’ date can be as much as a day older than the ‘publish’ date) and where comments are disabled!
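
To make the difference concrete, here is a minimal sketch in Ruby (not WordPress code; the Post struct, the two function names and the sample dates are all made up for illustration). It compares a build date taken from the latest ‘modified’ date with one taken from the latest ‘published’ date, for posts that were scheduled the day before:

```ruby
require 'time'

# Hypothetical post records: :published and :modified are UTC timestamps.
Post = Struct.new(:title, :published, :modified)

# The default logic as described above: the most recent 'modified' date.
def build_date_from_modified(posts)
  posts.map(&:modified).max
end

# The alternative: the most recent 'published' date.
def build_date_from_published(posts)
  posts.map(&:published).max
end

# Invented sample data: both posts were edited the evening before going live.
posts = [
  Post.new('Morning story', Time.utc(2020, 1, 10, 6, 0),  Time.utc(2020, 1, 9, 18, 0)),
  Post.new('Noon story',    Time.utc(2020, 1, 10, 12, 0), Time.utc(2020, 1, 9, 17, 0))
]

puts build_date_from_modified(posts).rfc2822
puts build_date_from_published(posts).rfc2822
```

With scheduled posts the first value stays stuck on the previous evening, while the second one moves forward every time something is actually published.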

More importantly, this was affecting how often Google actually processed our feed – not often enough, because it thought the feed was stale/not updated – and this severely hurt our chances of being found in Google News.

Here’s my solution (code to add to your functions.php or a custom plugin):

/* speed up feed cache: keep cached feeds for 600 seconds (10 minutes) */
add_filter( 'wp_feed_cache_transient_lifetime', function ( $a ) { return 600; } );

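/* overrides/changes the way 'lastBuildDate' is calculated for the standard WordPress RSS feed:
   use the publish date of the most recently published post, ignoring 'modified' dates and comments */
function my_lastpostmodified()
{
    global $wpdb;
    // offset of the server's timezone from GMT, in seconds
    $add_seconds_server = (int) date('Z');
    // newest published post, with its GMT publish date shifted by that offset
    $lastDateSql = $wpdb->get_var("SELECT DATE_ADD(post_date_gmt, INTERVAL '$add_seconds_server' SECOND) FROM $wpdb->posts WHERE post_status = 'publish' ORDER BY post_date_gmt DESC LIMIT 1");
    $lastDate = date_create($lastDateSql);
    // 'r' formats the date in the RFC 2822 style used by RSS
    return $lastDate->format('r');
}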

add_filter('get_lastpostmodified', 'my_lastpostmodified');
add_filter('get_feed_build_date', 'my_lastpostmodified');

Sources that helped me come up with this solution:

Backup or Store stuff to GMail via IMAP in Ruby

Once upon a time, I used to store some small automated backups in GMail just by having the scheduled backup send an email to my GMail account. At one stage they blocked me from doing so, marking those repeated emails as SPAM.

After that, I took a different approach: I kept sending the mail to a mailbox on the SAME server as the backup, and using IMAP I could DRAG-and-DROP the backup mail from the mailbox on one server to the mailbox on another server (=GMail). They did not mark me as a spammer that way, of course.
So that worked for a while, but then I got tired of doing this manually.

So the following Ruby script is how I automated the “move offsite” part of that backup.
For completeness, here is due credit to who set me on the right track: I started off from this example by Ryan.

#!/usr/bin/env ruby
begin_ = Time.now

#includes
require 'net/imap'

##Source Info
$SRCSERVER="mail.muscetta.com"
$SRCPORT=143
$SRCSSL=false
$SRCUSERNAME="daniele"
$SRCPASSWORD=""
$SRCFOLDER="INBOX.Backups"

##Destination Info
$DSTSERVER="imap.gmail.com"
$DSTPORT=993
$DSTSSL=true
$DSTUSERNAME="muscetta@gmail.com"
$DSTPASSWORD=""
$DSTFOLDER="Backup"

#connect to source
puts "connecting to source server #{$SRCSERVER}... \n\n"
srcimap = Net::IMAP.new($SRCSERVER, port: $SRCPORT, ssl: $SRCSSL)
srcimap.login($SRCUSERNAME, $SRCPASSWORD)
srcimap.select($SRCFOLDER)

#connect to destination
puts "connecting to destination server #{$DSTSERVER}... \n\n"
dstimap = Net::IMAP.new($DSTSERVER, port: $DSTPORT, ssl: $DSTSSL)
dstimap.login($DSTUSERNAME, $DSTPASSWORD)
dstimap.select($DSTFOLDER)

# Loop through all messages in the source folder.
uids = srcimap.uid_search(['ALL'])
if uids.length > 0
	$count = uids.length
	puts "found #{$count} messages to move... \n\n"

	srcimap.uid_fetch(uids, ['ENVELOPE']).each do |data|
		mid = data.attr['ENVELOPE'].message_id

		# Download the full message body from the source folder.
		puts "reading message... #{mid}"
		msg = srcimap.uid_fetch(data.attr['UID'], ['RFC822', 'FLAGS', 'INTERNALDATE']).first

		# Append the message to the destination folder, preserving flags and internal timestamp.
		puts "copying message #{mid} to destination..."
		dstimap.append($DSTFOLDER, msg.attr['RFC822'], msg.attr['FLAGS'], msg.attr['INTERNALDATE'])

		#delete the msg
		puts "deleting message #{mid}..."
		srcimap.uid_store(data.attr['UID'], '+FLAGS', [:Deleted])
		srcimap.expunge

	end

end

#disconnect (also when there was nothing to move)
dstimap.logout
dstimap.disconnect
srcimap.logout
srcimap.disconnect

total_time = Time.now - begin_
puts "Done. RunTime: #{total_time} sec. \n\n"
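
One thing the loop above does not guard against is a failed copy. A possible refinement, as a sketch only: wrap the per-message steps in a helper (move_message is a name I made up here) that deletes from the source only if the append to the destination went through. It assumes the same srcimap and dstimap connections as in the script, which already requires 'net/imap'.

```ruby
# Moves one message, identified by its UID, from the source to the destination.
# Returns true when every step succeeded. Returns false if any step raised;
# the delete steps are only reached after the append has succeeded, so a
# failed copy never removes the message from the source.
def move_message(srcimap, dstimap, uid, dstfolder)
  msg = srcimap.uid_fetch(uid, ['RFC822', 'FLAGS', 'INTERNALDATE']).first
  dstimap.append(dstfolder, msg.attr['RFC822'], msg.attr['FLAGS'], msg.attr['INTERNALDATE'])
  srcimap.uid_store(uid, '+FLAGS', [:Deleted])
  srcimap.expunge
  true
rescue StandardError => e
  warn "could not move message #{uid}: #{e.message}"
  false
end
```

Inside the loop this would replace the fetch/append/delete lines with something like `move_message(srcimap, dstimap, data.attr['UID'], $DSTFOLDER)`.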

Searching for myself on various search engines

Searching for myself on Yahoo Image Search, uploaded by Daniele Muscetta on Flickr.

Here I start a quick comparison of what search engines actually find about me.
I am glad to read that Live Search can find Jimi Hendrix’s face, and Google can spot those portraits of Paris Hilton.
Unfortunately I am not as famous as them, so not enough people have tagged me. Not on “normal” web pages or newspapers.

Yahoo did a great/smart thing buying Flickr.
It gets people doing the TAGGING for them.
So the results are accurate for pretty much everything.

Ok granted. All of these pictures are coming out of Flickr.
But while that is a limitation, it is also its power.

This is also why I was able to search for “blackberries” the other day and find the thing I was searching for, that is FRUIT that grows spontaneously in the woods, rather than a bunch of stupid mobile telephones.
try: images.search.yahoo.com/search/images?p=blackberry+OR+fruit

Doing the same search on Google:

Searching for myself on Google Image Search

Ok, this is not all from Flickr anymore; they actually have the rest of the web in their database. Most of them are pictures I took – granted. But only one is OF me, and it is definitely not the first one. Ninth position.

try the blackberry search: images.google.com/images?svnum=10&q=blackberry+OR+fruit

And now Live Search:

Searching for myself on Live Image Search

Same as Google: images from everywhere. Fewer images than Google. Most of them taken by me (not all). An actual picture of myself is in 9th position.

My blackberry search here finds a lot of fruit…

Strangely enough, there’s an iPhone among them!!!!

Google has pissed me off this week!

Now I pretty much liked GMail and Google in general. But this time they REALLY pissed me off! I will tell you that I am not a Google-hater even if I work for a competing company. Of course not everything that Google does is wonderful, but some of their services are really cool and useful, and I have never refused to say they rocked when I felt they did.
In general, people seem to love them, and their stock value shows it (with the launch of “Code Search” this week they made a lot of people scream “how cool is this” so that they got back from just under 400 dollars to 417!). But that’s not the issue. That is cool, that works. It’s ok they make money if they make cool tools. It’s fine for me.

In fact I consider GMail one of the best interfaces for reading mail that exist out there – I love “tagging” (oops: it’s called “labelling” in their syntax), speed of search through messages (even though Outlook 2007 is faster on indexed content, but still you have to buy it and install it on your PC)… I also especially love the way it shows THREADING… so much so that I moved pretty much EVERY mailing list I read to my account there:

Ma come se fa ?
(ok, they could do better with the localized version of “Re:” in replies…. in Italian a lot of broken MUA’s translate that into “R:” and that isn’t understood by GMail and will make it think it is another thread…. but that’s a minor issue, and also one that every MUA handling threading has – including “mutt” – the real problem is the broken MUAs sending the “R:” in the first place. But I digress too much….).

I also keep GMail continuously open in a browser during the day, because a lot of informative mail, and mail sent by friends, goes there. This is to say that I do get a lot of their ads (that is the point of having such an application, for them…). By contrast, Windows Live Mail reduced its ads to show only one… so as not to annoy you too much.
But the ads in GMail were not *really* a problem (I don’t read them anyway, I just plain IGNORE THEM).

But this week they REALLY pissed me off. They REALLY have. And here is the reason:
I have been using a script for MONTHS to back up my database (the one powering THIS blog) and send it “off-site” to my GMail mailbox. Pretty much what a lot of other people do, as described in various articles and blog posts. Then I was labelling the mails with a rule, so that I could access my backups easily in case I needed them.

Now I don’t know if this violates their terms of use in any way… because I am not really using it as storage with those programs that circulated at one stage that had “reverse engineered” it. Those were bypassing the web interface altogether, so people did use it as storage with a program without having to see their ads. That was the issue, I think. In my case, I am just sending MAILS to myself. One per day. I also delete the old ones every now and then, and they are not even huge in size (attachments of 40 to 50KB so far!!)… anyway, I know a lot of people that store documents and all sorts of stuff even in their corporate mailboxes in Outlook (then maybe index them with Windows Desktop Search or Google Desktop to find them again)… I was only doing the same with GMail. I don’t see the big issue here….. they might think otherwise…. but from what happens I don’t think that’s the issue.

Anyway, now it’s been three or four days that my backup mail gets rejected. My SMTP Server gets told:

host gmail-smtp-in.l.google.com[66.249.83.27] said:
550-5.7.1 Our system has detected an unusual amount of unsolicited
550-5.7.1 mail originating from your IP address. To protect our
550-5.7.1 users from spam, mail sent from your IP address has been
550-5.7.1 rejected. Please visit
550-5.7.1 http://www.google.com/mail/help/bulk_mail.html to review
550 5.7.1 our Bulk Email Senders Guidelines.

Now for fuck’s sake. You know how much I hate SPAMMERS and what I would like to do with them. But I also know that it does happen to end up in RBLs and such sometimes. Fine. But GIVE ME a way to tell you that I am NOT one! If you go to the link above, all you find is a form where you can specify that mail that ended up in your “junk” folder actually wasn’t spam. Yeah, right. In my case it does not even go into my “junk” folder! How am I supposed to give them the original header that arrived to THEM if I only have the one sent by my mailserver? They just blacklisted my mail server’s IP address! As they say, I even have an SPF record, I always use the same address, etc….
So I tried to fill in the form; the day after, I also tried to contact their abuse@google.com and abuse@gmail.com addresses.
Still nothing.
They even tell you (in the automated reply when you contact “abuse”):
“[…] For privacy and security reasons, we may not reveal the final outcome of an abuse case to the person who reported it. […]”.
How great. How am I supposed to know if they even READ my complaint ?

You anti-spam people at GMail: “I am NOT a fucking spammer!!!!!”. I haven’t found a better way to tell ya this, you know, than writing it on my blog… this is just RIDICULOUS!

But to date my mails still get dropped. I’ll probably have to send my backups somewhere else. At this point they pissed me off so much that I am also seriously considering going back to using my own mailserver for receiving and reading my mailing lists too. Then I won’t get ads there.
Afzetterij!
(I hope you have some Dutch guy on board at Google, as “Google Translate” does not translate from/to Dutch yet…. )

Edited on October 8th – While GMail REJECTS those mails (it SAYS it is not accepting them), Hotmail simply DROPS them (that is: it does not even SAY it is not accepting them):

to=, relay=mx4.hotmail.com[65.54.245.104], delay=3, status=sent (250 <20061008061010.GA19807@muscetta.com> Queued mail for delivery)

This way you THINK it is going to be delivered, but it NEVER shows up in your inbox. I don’t know who’s behaving worse…
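
The difference between the two behaviours is visible in the SMTP reply itself. Here is a small Ruby sketch (parse_smtp_reply and sender_notified_of_failure? are hypothetical helper names, not part of any library) that reads replies like the ones quoted above. In a multi-line reply every line but the last has a “-” right after the three-digit code, and only a 4xx/5xx code tells the sending server that something went wrong. The GMail reply is shortened to its first and last line here.

```ruby
# Splits a (possibly multi-line) SMTP reply into its numeric code and text.
def parse_smtp_reply(raw)
  lines = raw.lines.map(&:strip).reject(&:empty?)
  code = lines.first[0, 3].to_i
  # Skip the code and the separator ("-" or " ") on each line.
  text = lines.map { |line| line[4..-1].to_s }.join(' ')
  { code: code, text: text }
end

# 4xx and 5xx replies are failures the sending server gets to see;
# a 2xx reply means "accepted", whatever happens to the mail afterwards.
def sender_notified_of_failure?(reply)
  reply[:code] >= 400
end

gmail = parse_smtp_reply(<<~REPLY)
  550-5.7.1 Our system has detected an unusual amount of unsolicited
  550 5.7.1 our Bulk Email Senders Guidelines.
REPLY
hotmail = parse_smtp_reply('250 <20061008061010.GA19807@muscetta.com> Queued mail for delivery')

puts sender_notified_of_failure?(gmail)
puts sender_notified_of_failure?(hotmail)
```

So a log line with status=sent and a 250 reply only proves that the other side took the message, not that it ever reached an inbox.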

Annoying spammer and lame defacers – part three

I am sorry but, since they did it again, I have removed the possibility to post HTML tags in a comment – this way, if the reason for their idiotic comments was to increase their ranking in Google, at least that won’t be accomplished.

I prefer to leave anonymous posting capabilities to my visitors, but I don’t like helping spammers do their crap.
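
For anyone wanting the same effect elsewhere, one common way to do it is to escape the HTML special characters before a comment is displayed. This is a minimal Ruby sketch, not necessarily how this blog’s software does it; neutralize_comment is a made-up name, and the sample comment is invented.

```ruby
require 'cgi'

# Turns HTML special characters into entities, so tags posted in a comment
# are displayed as plain text instead of being rendered as markup.
def neutralize_comment(text)
  CGI.escapeHTML(text)
end

puts neutralize_comment('<a href="http://example.com">great site</a>')
```

The link then shows up as literal text, so there is no clickable link left in the comment.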

Annoying spammers and lame defacers

I just realized that some guy, thinking he was funny, used my “comment” link under the blog posts to fill in a lot of crap – some sort of spam message promoting crap like the ones that fill our inboxes lately, with links to their site, in order to (I believe) raise their ranking in Google or something like that. They have been sitting there for some days – I did not closely monitor the server for a while – I just relocated, and have been busy with the new job and everything.
You really can’t leave them alone a minute!

I have read of people writing this sort of crap on wikis too, and I just don’t get why people should be so lame as to use a public facility to write their crap. Possibly it is the same sort of people who write on walls…..

The logs were reporting the posts happened from two IP addresses:
38.119.107.88 and 213.91.217.78.

I have now cleared them. If they continue I will have to deactivate the possibility for people to answer/comment on posts…. which would be a pity.
