Category Archives: Code

WordPress SEO Sitemap and Heroku

You may have read in a previous post about Heroku and WordPress

that we have been using PostgreSQL as the persistent storage for this blog.

Aside from the problems with our hosting as it is, and the fact that we could not (and still cannot) run the PG4WP add-on effectively with Heroku and WordPress, we found another rather serious problem.

Last night I was browsing Google’s Webmaster Tools

and I found out that my sitemap was not working properly.

The sitemap index (http://www.must-feed.com/sitemap_index.xml) was loading, but if you tried to open the posts sitemap it links to (http://www.must-feed.com/post-sitemap.xml), it responded with a Not Found (404) page.

Posts sitemap was Not Found (404)

Looking through the Heroku logs (run heroku logs in a terminal), I found this really interesting error:


[error] WordPress database error ERROR: date/time field value out of range:
"0000-00-00 00:00:00"\nLINE 1: ...ssword = 'xxasdf' AND post_author
!= 0 AND post_date != '0000-00-0...\n
^ for query 
   SELECT COUNT(ID) FROM wp_posts
   WHERE post_status IN ('publish','inherit') AND
   post_password = '' AND post_author != 0 
   AND post_date != "0000-00-00 00:00:00" AND post_type = 'post' 
made by require('wp-blog-header.php'), wp, WP->main, 
WP->query_posts, WP_Query->query, WP_Query->get_posts, 
do_action_ref_array, call_user_func_array, WPSEO_Sitemaps->redirect, 
WPSEO_Sitemaps->build_sitemap, WPSEO_Sitemaps->build_post_type_map

I dug into the code a bit and found out

that the problem was caused by a query run by the WordPress SEO by Yoast plugin.

This query:


SELECT COUNT(ID) FROM $wpdb->posts {$join_filter} 
WHERE post_status IN ('publish','inherit') AND post_password = '' 
AND post_author != 0 AND post_date != "0000-00-00 00:00:00" AND post_type = %s

had a date that is invalid for a PostgreSQL database.

On top of that, the guy who wrote PG4WP,

the module that connects Postgres with WordPress (and which, to be honest, is heavily resource-consuming), had thought of sanitizing queries against this case only on INSERTs:

$sql = str_replace( "0000-00-00 00:00:00", "'now() AT TIME ZONE 'gmt'", $sql);

EDIT: While writing this post I found out that the line above was itself injecting SQL: the replacement was escaped by the backslashes around gmt, which again caused Postgres to fail, so I changed it. The Ajax action for post creation calls the wp_insert or wp_update method, which in turn tries to insert a new post. I will come back with another update, since I still cannot get the single quotes escaped properly…

And not on SELECTs and UPDATEs.

So all I had to do was add a new line at line 290

of the file /wp-content/pg4wp/driver_pgsql.php:


$sql = str_replace("0000-00-00 00:00:00", "1977-01-01", $sql);

to sanitize the SELECT as well, after which the posts sitemap page (and its XML) loaded properly.
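
A slightly more defensive take on these replacements could look like the sketch below. It assumes the zero date always shows up as a complete quoted literal, as it does in the logged query, and it matches the literal together with its quotes so that nothing is left dangling inside a string. The function and constant names are hypothetical and do not exist in PG4WP; the fallback date is simply the one used in the patch above.

```php
<?php
// Hypothetical helpers for illustration; none of these names exist in PG4WP.

// Matches the MySQL zero date together with its surrounding quotes
// (single or double), so the whole literal gets replaced.
const ZERO_DATE_LITERAL = '/([\'"])0000-00-00 00:00:00\1/';

// For SELECT and UPDATE: swap the zero date for a date PostgreSQL accepts.
// The result is always single-quoted, since PostgreSQL treats
// double-quoted text as an identifier rather than a string.
function sanitize_zero_dates($sql, $fallback = '1977-01-01')
{
    $out = preg_replace(ZERO_DATE_LITERAL, "'" . $fallback . "'", $sql);
    return $out === null ? $sql : $out; // keep the query untouched on regex failure
}

// For INSERT: swap the zero date for the current GMT time. The expression
// replaces the quotes too, so it is not left inside a string literal.
function zero_dates_to_now($sql)
{
    $out = preg_replace(ZERO_DATE_LITERAL, "(now() AT TIME ZONE 'gmt')", $sql);
    return $out === null ? $sql : $out;
}

echo sanitize_zero_dates('SELECT COUNT(ID) FROM wp_posts WHERE post_date != "0000-00-00 00:00:00"'), PHP_EOL;
echo zero_dates_to_now("INSERT INTO wp_posts (post_date_gmt) VALUES ('0000-00-00 00:00:00')"), PHP_EOL;
```

Whether the driver sees the literal with single or double quotes at that point depends on what PG4WP has already rewritten, which is why the pattern accepts both.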

EDIT: I have also found out that the same bug affects the press_this.php functionality, which you cannot use until you sanitize it as well.

EDIT 2: Different modules trigger numerous errors during the query rewrite. I will publish my patches in a new GitHub repo. Recently I found out that Jetpack also had a problem; more on that later.

The Not Found error disappeared and the post entries were generated in the XML.

Finally, after all this, I have to say

that the way WordPress uses the database is, to say the least, RIDICULOUS.

Having worked with many systems on the web, I suggest moving to a more database-agnostic layer, such as PDO (pdo_mysql on its own is only the MySQL driver).

Upgrading to WordPress 3.9 in Heroku

As you may already know, our blog is using the WordPress Heroku PHP Buildpack.

Since April 2014, Heroku has updated its buildpack to officially support PHP (their final version is here). They have done a really fine job compared to the old legacy buildpack, which involved a custom build batch file and a plain empty PHP file to get the Heroku dyno to run the Apache process together with the PHP process. More details from this awesome guy here: https://github.com/mhoofman/wordpress-heroku and the dyno changes here: https://github.com/xyu/wordpress-heroku .

Our current installation, though, also uses WordPress.

Since then WordPress has released some considerable updates, and we needed to update to the newest version. So I went through the necessary process: downloading WP 3.9.2, extracting it over the current setup, committing, and pushing to heroku master.

At first, Heroku failed: by default it uses the new buildpack, which requires a composer.json to exist so that it can run Composer and then configure the PHP-Heroku-Buildpack. That was my point of failure, since I did not notice that you can keep the legacy buildpack (but hey, new is always better, right?) and I agreed to take on the new one.

After following the Getting Started With PHP guide, I created a composer.json (currently empty, since we did not require anything new) and modified the Procfile, since we needed to explicitly tell Heroku that we had a custom Apache configuration.

A custom Apache Configuration?

Aye. While on the old buildpack you could initialize the stack with PHP, some WordPress plugins (such as Jetpack, or Zip support) made multiple requests to the server (Ajax, I assume?) and the old buildpack did not support concurrent requests. Zip support also required including mod_deflate.so. So I found this guy, https://github.com/xyu/ , who wrote a custom script that does nothing more complex than include his own configuration in the Apache conf, which in turn enabled concurrent requests. This file was loaded at startup, when the Heroku dyno was started.

Searching the Heroku documentation

I found out that if you want to include a custom Apache config file, you have to modify the Procfile and pass it the config file, somehow like this:
web: vendor/bin/heroku-php-apache2 -C apache_app.conf

Obviously that did not work.

There were warnings stating that ServerLimit cannot be declared inside a VirtualHost section, which made more sense later on, when I saw how the new Heroku buildpack integrates Apache…

Aside from that, in the Heroku logs

there were plenty of warnings (and a fatal) like these:
2014-04-30T17:59:48.184463+00:00 app[web.1]: [30-Apr-2014 17:59:47] WARNING: [pool www] child 48 said into stderr: "NOTICE: PHP message: PHP Warning: pg_query(): Query failed: ERROR: missing FROM-clause entry for table "session""
2014-04-30T17:59:48.184466+00:00 app[web.1]: [30-Apr-2014 17:59:47] WARNING: [pool www] child 48 said into stderr: "LINE 1: SELECT @@SESSION.sql_mode"
2014-04-30T17:59:48.184468+00:00 app[web.1]: [30-Apr-2014 17:59:47] WARNING: [pool www] child 48 said into stderr: " ^ in /app/wp-content/pg4wp/driver_pgsql.php on line 140"
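
The failing statement, SELECT @@SESSION.sql_mode, uses MySQL's @@ system-variable syntax; judging by the error, PostgreSQL reads SESSION as a table name instead. A rough sketch of how a driver could recognize such statements before handing them to pg_query(), assuming it is acceptable to skip them. The function name is hypothetical, and this is not how PG4WP deals with it.

```php
<?php
// Hypothetical helper for illustration; this name does not exist in PG4WP.

// Returns true for statements that read MySQL system variables,
// e.g. "SELECT @@SESSION.sql_mode". PostgreSQL has no @@ variables.
function is_mysql_system_variable_query($sql)
{
    return preg_match('/^\s*SELECT\s+@@/i', $sql) === 1;
}

var_dump(is_mysql_system_variable_query('SELECT @@SESSION.sql_mode'));      // bool(true)
var_dump(is_mysql_system_variable_query('SELECT COUNT(ID) FROM wp_posts')); // bool(false)
```

A driver could answer such statements with an empty result instead of sending them to the server; whether WordPress copes with that is untested here.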

I’m not a Postgres expert (I’m more familiar with MySQL), so I decided that I should install my blog locally.

I downloaded the Postgres dump, restored it in a local instance along with the new buildpack, ran an Apache instance, and voilà, another error:
LINE 1: SELECT @@SESSION.sql_mode
^ in /var/sites/example.com/www/wp-content/pg4wp/driver_pgsql.php on line 133
PHP message: PHP Fatal error: Call to undefined function wpsql_errno() in /var/sites/example.com/www/wp-content/pg4wp/core.php(32) : eval()'d code on line 1531"

That one was mitigated by these guys here: https://vitoriodelage.wordpress.com/2014/06/06/add-missing-wpsql_errno-in-pg4wp-plugin/

When I was sure that everything worked as intended, I finally pushed again.

Nothing. 0. Actually 500:

500

I checked the logs and was getting a VERY descriptive message:
2014-22-11T03:17:49+00:00 heroku[router]: Error H12 (Request timeout) -> GET / dyno=web.1 queue= wait= service=30000ms status=503 bytes=0

This error was not helping at all. No matter what I tried after that, I kept getting timeouts, as if the PHP/Apache process was eating up all the resources of the dyno. Clearly something was very wrong…

Finally, I decided to use the standard legacy buildpack

and reverted everything, after upgrading to the new WordPress, migrating the database locally and updating the cloud afterwards.
I switched to the legacy buildpack by running this in my local Heroku environment:
heroku config:set "BUILDPACK_URL=https://github.com/heroku/heroku-buildpack-php.git#legacy"

 

I have just found out that the guy who wrote the plugin connecting to Postgres (this guy: https://wordpress.org/plugins/postgresql-for-wordpress/ ) has recently stopped developing it. Later I also found Xiao, who migrated his blog to an HHVM/Nginx/MySQL Heroku buildpack: https://github.com/xyu/heroku-wp (although I still have second thoughts about how compatible HHVM is with PHP 5.3, and more specifically with 5.3 in WordPress).

As soon as I have updated I will gather results to see how it went.

EDIT: It looks like there is an update for PG4WP that I have been missing. I am trying it and will let you know…
EDIT2: Nothing. The process keeps hanging and returns:
at=error code=H12 desc="Request timeout" method=GET path="/" host=www.must-feed.com request_id=fcaa61c9-1bbf-48d2-9b3c-797ca14ffa58 fwd="192.168.84.223" dyno=web.1 connect=1ms service=30000ms status=503 bytes=1240

Anger! Unlimited, Unmetered ANGERRRR!

Developer Anger!

Developers often face situations where logic bends, and this happens quite often. Suppose we have an equation whose factors produce this “bending logic” as their outcome.

The first factor, and at times the largest, is client input. “Yes, we know we asked for that, but now things have changed, so we have to change it”, “What? I did not ask for a red line drawn with a blue pen!” and so on. More on that at The Expert.

The second factor is the programming language we are using, and that is the one we will stick with for now. The third and fourth factors are experience and capability, but those vary from person to person, so we won’t be discussing them now.

ANGER. WRITING CODE THAT SEEMS TO WORK.

Quite recently I stumbled upon this interesting site :

http://www.commitlogsfromlastnight.com/

It turns out that this guy, @abestanway, parses repos from GitHub and records every curse found in the commit log messages. A commit log message is the message we developers write when we finish a project, code snippet, change or bug fix in the code of a system we are working on.

He also took it a step further. He gathered the data and created a presentation showing the languages with the most curses. He even created graphs of the most cursed and least cursed languages:

Curses / Language

As you can see, JavaScript comes first. This language is undoubtedly the most widely used, so it makes sense for it to have the most curses (and yeah, plenty of other stuff you already know, like changing scopes and other shitty JavaScript stuff).

He also created graphs with the most used curses:

See that Fuck prevails all!!

Next, he compared the most cursed languages with the most used languages on GitHub. You can see all of the above in the relevant video:

Therefore :

ANGERRRRRRRRRRRR
UNLIMITED ANGERRRRR!

To conclude, he points out that the more curses, the better the code…