Tag Archives: wordpress

WordPress SEO Sitemap and Heroku

Maybe you have read at a previous post about Heroku and WordPress

that we have been using PostgreSQL as a persistent storage for this current blog.

Aside from the problems with our hosting as it is, from the fact that we could not (and still cannot) run add-on PG4WP effectively with Heroku and WordPress, we found another rather serious problem.

Last night I was browsing Google’s Webmaster Tools

and I found out that my sitemap was not working properly.

Despite the fact that the link was loading (http://www.must-feed.com/sitemap_index.xml) if you got in and tried to load the posts sitemap (http://www.must-feed.com/post-sitemap.xml) it responded with a Not Found (404) page.

Post Sitemaps was Not Found (404)

Looking around the Heroku logs (in terminal write heroku logs) and I found out this really interesting error :

[error] WordPress database error ERROR: date/time field value out of range:
"0000-00-00 00:00:00"\nLINE 1: ...ssword = 'xxasdf' AND post_author
!= 0 AND post_date != '0000-00-0...\n
^ for query 
   SELECT COUNT(ID) FROM wp_posts
   WHERE post_status IN ('publish','inherit') AND
   post_password = '' AND post_author != 0 
   AND post_date != "0000-00-00 00:00:00" AND post_type = 'post' 
made by require('wp-blog-header.php'), wp, WP->main, 
WP->query_posts, WP_Query->query, WP_Query->et_posts, 
do_action_ref_array, call_user_func_array, WPSEO_Sitemaps->redirect, 
WPSEO_Sitemaps->build_sitemap, WPSEO_Sitemaps->build_post_type_map

I digged the code a bit and found out

that the problem was being caused by a query which was ran from WordPress SEO Yoast Plugin.

This query :

SELECT COUNT(ID) FROM $wpdb->posts {$join_filter} 
WHERE post_status IN ('publish','inherit') AND post_password = '' 
AND post_author != 0 AND post_date != "0000-00-00 00:00:00" AND post_type = %s

Had an invalid date for PostgreSQL database.

Not to mention that also, the guy who wrote PG4WP

the module that connects Postgres with WordPress (to be honest that is heavily resource consuming) had thought of sanitizing the query from these cases only at INSERTS:

$sql = str_replace( "0000-00-00 00:00:00", "'now() AT TIME ZONE 'gmt'", $sql);

EDIT: I found out that the above line while writing this post
was SQL Injecting the code which was escaped by  the back slashes of gmt and caused again Postgres to fail so I changed it properly. The ajax action of post creation was calling the wp_insert or wp_update method which in turn was trying to insert a new post. I will come back with another update since I cannot escape single quotes somehow…

And not on SELECTS and UPDATES.

So all I had to do was add a new line at 290

of file : /wp-content/pg4wp/driver_pgsql.php

$sql = str_replace("0000-00-00 00:00:00", "1977-01-01", $sql);

to sanitize the SELECT and the posts sitemap page (and xml) was being loaded properly:

EDIT: I have also found out that the same bug applies with the press_this.php functionality and you cannot use it (you need to sanitize this as well).

EDIT 2: There are numerous errors during rewrite from different modules. I will come up with my patches at a new github repo. Recently I found out that Jetpack also had a problem, more on that later.

Not Found error disappeared and the Posts entries were generated in the xml.
 Finally after all those I have to say

that the way WordPress uses the Database is the least RIDICULOUS.

Having worked with many systems on web, I suggest to move to a more database agnostic framework, such as pdo_mysql.