From Drupal 5.10 to WordPress 2.9.1

I decided to switch from my aged Drupal CMS to WordPress because my website doesn’t really need all the flexiblity that Drupal offers. I also hope to make maintenance a bit easier — time will tell how that works out. Anyway, this wasn’t a very rational decision that I researched a lot. Maybe I just wanted to try WordPress and see how it works.

This post mainly is a reference for me to help me remember what I did, but it may be useful for others too…

My first concern was to import my current content from Drupal, which isn’t really all that much if it wasn’t for the 800+ comments on my Wiimote Whiteboard page. A quick search revealed this WordPress Codex page for tips on migration, which led me to the most recent of those articles (still about a year old) that deals exactly with migration from Drupal 5 — so I was happy and got to work. I did as instructed in that article and dumped the SQL commands to phpMyAdmin to migrate the data. I don’t know if things have changed in either Drupal or WordPress since that post was written, but I encountered a number of problems with those SQL commands.

I won’t reprint the SQL script from that article here, you can download my updated version here. I’ll just go over the changes that I made (highlighted in blue) and why I did them (some of them have been mentioned in the comments of the article). I didn’t migrate any categories or tags, so be aware that this might not work. I’ll assume in the following that Drupal is installed in the database drupal without table prefix and WordPress is installed in the database wordpress with the table prefix wp_.

Migrating posts

I also copied the author’s user ID, which is no problem for my single user blog. Also change the last line of the commands to the type of posts that should be considered, your milage may vary. The original type post is probably wrong, because blog posts in Drupal are usually either of type blog or story.

# posts; keeping private posts hidden
INSERT INTO wordpress.wp_posts (id, post_author, post_date, post_content,
 post_title, post_excerpt, post_name, post_modified, post_type, `post_status`)
SELECT DISTINCT
 n.nid `id`,
 n.uid `post_author`,
 FROM_UNIXTIME(n.created) `post_date`,
 r.body `post_content`,
 n.title `post_title`,
 r.teaser `post_excerpt`,
 IF(SUBSTR(a.dst, 11, 1) = '/', SUBSTR(a.dst, 12), a.dst) `post_name`,
 FROM_UNIXTIME(n.changed) `post_modified`,
 n.type `post_type`,
 IF(n.status = 1, 'publish', 'private') `post_status`
FROM drupal.node n
INNER JOIN drupal.node_revisions r
 USING(vid)
LEFT OUTER JOIN drupal.url_alias a
 ON a.src = CONCAT('node/', n.nid)
WHERE n.type IN ('page', 'blog')

# blog posts are of type 'post' in wordpress
UPDATE wordpress.wp_posts SET post_type = REPLACE(post_type, "blog", "post");

Migrating comments

I wanted to preserve the thread structure of my existing comments, a feature which is unfortunately handled differently by the two systems. Drupal uses a character sequence in the field thread and WordPress uses the comment_ID of the parent node in the field comment_parent. Example:

ID   Structure   thread (Drupal)   comment_parent (WordPress)

 1   A           01/               
 2   ⌊ B         01.00/            1
 4     ⌊ D       01.00.00/         2
 5     ⌊ E       01.00.01/         2
 3   ⌊ C         01.01/            1

The original SQL script actually copies the content of thread directly to comment_parent, which not only is semantically wrong but even does harm since it leads to non-existent comment references — which caused things like pagination of comments to break for me. Converting this cannot be handled by simple SQL statements, so I wrote a small PHP script that does the conversion (you can download it here):

<pre>
<?
  // define all those variables...
  $connection = mysql_connect($mysqlhost, $mysqluser, $mysqlpwd) or die("connect error");
  mysql_select_db("drupal", $connection) or die("select db error");


  function thread_comment($nid,$hmap,$cid,$pre) {
    $cond = isset($pre) ? "thread LIKE '".substr($pre, 0,-1).".%/' AND CHAR_LENGTH(thread) = ".(strlen($pre)+3)."" :
                          "CHAR_LENGTH(thread) < 5";

    $sql = "SELECT cid, nid, subject, timestamp, thread, name FROM comments ";
    $res = mysql_query($sql."WHERE nid=$nid AND $cond");

    while ($arr = mysql_fetch_array($res)) {
      print "$arr[thread]: $arr[subject]\n";
      $hmap[$arr[cid]] = $cid;
      $hmap = thread_comment($nid,$hmap,$arr[cid], $arr['thread']);
    }
    return $hmap;
  }

  $hmaps = array();

  $res = mysql_query("SELECT nid FROM comments WHERE 1 GROUP BY nid");
  while ($arr = mysql_fetch_array($res)) {
    print "NODE ID $arr[nid]:\n\n";
    $hmaps[$arr[nid]] = thread_comment($arr[nid],array());
    print count($hmaps[$arr[nid]])."\n\n";
  }

  mysql_select_db('wordpress', $connection) or die("select db error");

  foreach($hmaps as $nid => $hmap) {
    foreach($hmap as $child => $parent) {
      mysql_query("UPDATE wp_comments SET comment_parent = $parent WHERE comment_ID = $child AND comment_post_ID = $nid");
    }
  }
?>
</pre>

There were several other things to improve. The user ID and IP address of the commenter weren’t copied, and comment_approved was set incorrectly; not setting the field comment_date_gmt can also cause problems. Additionally, comments have subjects in Drupal but not in WordPress; I simply added them at the beginning of the comment in a div-block with class comment-subject for styling via CSS. Here are the modified SQL commands:

# comments; keeping unapproved comments hidden
INSERT INTO wordpress.wp_comments (
 user_id, comment_ID, comment_post_ID, comment_date, comment_date_gmt,
 comment_content, # comment_parent,
 comment_author, comment_author_email,
 comment_author_url, comment_author_IP, comment_approved )
SELECT
 uid, cid, nid, FROM_UNIXTIME(timestamp), FROM_UNIXTIME(timestamp),
 CONCAT('<div class="comment-subject">',subject,"</div>\n\n",comment), # thread,
 name, mail, homepage, hostname, NOT(status)
FROM drupal.comments;

# update comments count on wp_posts table
UPDATE wordpress.wp_posts SET `comment_count` = (SELECT COUNT(`comment_post_id`)
FROM wordpress.wp_comments
WHERE `wp_posts`.`id` = `wp_comments`.`comment_post_id`);

Note that I also copied the (auto-incremented) comment ID, because I wanted to keep my existing permalink structure /comment/[comment_id] to link to individual comments.

I wrote the small script comment-redirect.php and put it in the WordPress base directory, then forwarded from /comment/(\d+) to /comment-redirect.php?cid=$1 by using the great Redirection plugin.

<?
  // comment-redirect.php
  $comment_ID = $_GET['cid'];
  require('./wp-blog-header.php');
  header('location: '.get_comment_link($comment_ID));
?>

A cleaner solution would obviously be to write a plugin for that, but I actually had to change quite a bit of core WordPress code to get comments working (the way I wanted). Time for another blog post I guess…