Now With More UTF (Converting Your WordPress Database to UTF-8)

by Kabitzin on September 3, 2007 in Blogging Tips, Site News

Unsatisfied with the previous 7 Unicode Transformation Formats, I recently converted the database to UTF-8. Before I get into the gory details, here’s all you need to know:

Back up your database, install this plugin, read the instructions, and then follow them. Everything will be done with the push of one button. There is also a sanitizer plugin in case something goes wrong, but I have not tried that one myself.

Edit: According to the WordPress Codex, the above plugin does not work properly on newer versions of WordPress. You will want to back up your database tables and take the long route very carefully. The sanitizer will still work as a “last resort”.

Now the real motivator to convert came from JP Meyer’s post about the annoyance of converting. Basically it comes down to the fact that using UTF-8 will make your blog easily portable, and old versions of WordPress did not use UTF-8. Converting by hand sounds really bad, but since the steps can be automated, someone finally came up with a plugin to do everything for you.

[Image: utf02]

Conversion to UTF-8 is definitely an issue for anime bloggers, because many of us use foreign characters in our posts, titles, categories, etc. Everything will seem to work until you try to move your database (such as when JP tried to move his databases to a new webhost), at which point all the old Latin1 stuff will look like garbled crap. After you have converted all your data once, you should not have to ever do it again, as new versions of WordPress are set to use UTF-8 by default. I’m not sure exactly how you can tell everything was converted correctly, but I did notice that my blogroll order (alphabetical) changed when I refreshed the main page after I completed the conversion. So for JP’s hard work, he is now listed last on the blogroll orz.
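To make the "garbled" part concrete, here is a minimal Python sketch of what can happen to foreign characters during a move. It rests on an assumption the post does not spell out: that the old database holds UTF-8 bytes in columns labeled Latin1, and that the new host reads those bytes back as Latin1. The sample title is made up for illustration.

```python
# Assumption: the old tables hold UTF-8 bytes in columns the server
# believes are Latin1. The title below is a made-up example.
original = "アニメ café"

# What actually sits on disk: the UTF-8 bytes of the title.
stored_bytes = original.encode("utf-8")

# The new host reads those bytes as Latin1, so every single byte
# turns into its own (wrong) character.
garbled = stored_bytes.decode("latin-1")

# As long as nothing else has touched the text, the mistake can be
# undone by reversing the two steps.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))          # unreadable, one character per byte
print(repaired == original)   # True
```

The garbled string has one character per stored byte, which is why titles with Japanese text or accents come out much longer than the originals. Plain ASCII survives untouched, so English-only posts tend to hide the problem.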

I remember hearing about the idea first on Jason’s update post. I went to the link he included, started reading, and immediately thought, “OMG this sounds really painful, I’ll do it later.”

[Image: utf01]

Beginning with version 2.2, WordPress allows the user to define both the database character set and the collation in their wp-config.php file. Setting the DB_CHARSET and DB_COLLATE values in wp-config.php causes WordPress to create the database with the appropriate settings. However, these settings can only be designated for new installations, not for ‘already installed’ copies of WordPress.
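As a rough way to check which of these values a given wp-config.php declares, the Python sketch below pulls DB_CHARSET and DB_COLLATE out of the file's text. The two define lines in the sample mirror the ones shipped in the stock wp-config-sample.php; the database name and the helper `read_charset_settings` are hypothetical, and the regular expression is a simplification that does not skip commented-out lines.

```python
import re

# Sample wp-config.php fragment. The DB_CHARSET / DB_COLLATE lines mirror
# the stock sample file; the database name is a made-up placeholder.
SAMPLE_WP_CONFIG = """
define('DB_NAME', 'example_db');
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');
"""

# Matches define('DB_CHARSET', '...') and define('DB_COLLATE', '...')
# with either quote style. It does not ignore commented-out lines.
DEFINE_RE = re.compile(
    r"""define\(\s*['"](DB_CHARSET|DB_COLLATE)['"]\s*,\s*['"]([^'"]*)['"]\s*\)"""
)


def read_charset_settings(config_text):
    """Return the DB_CHARSET / DB_COLLATE values found in the given text."""
    return {name: value for name, value in DEFINE_RE.findall(config_text)}


settings = read_charset_settings(SAMPLE_WP_CONFIG)
print(settings)
```

An old wp-config.php from before 2.2 would give an empty result here, which matches the situation described in the post: the file has to be replaced with the newer version before WordPress is told to use UTF-8.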

I had been using the old wp-config.php file and so I actually had to download the newest WordPress, and use the new version of the wp-config.php that tells WP to use UTF-8. Then I used the WP-DBManager to back up the database. After this step, you can either do things the hard way, or the easy way. I picked the easy way, and if you want some background info about this plugin, here’s the forum thread. Keep in mind, the converter plugin only works for versions 2.1.x and 2.2.x of WordPress. The conversion does take a little bit of time, depending on the size of your database, so I cleaned up my database (dropping unneeded information in various plugin-created tables) to cut down on the amount of information that the plugin needed to convert.

Edit: As the conversion plugin has not been updated since, those who are converting from a version of WordPress newer than 2.2.x will get database corruption errors from the plugin. Unfortunately, you are currently stuck using the long way of converting your database until someone comes up with a working plugin.

This post was written by Kabitzin, who has written 1935 posts on Sea Slugs! Anime Blog.

One of the founders of Sea Slugs, I handle most of the blog admin tasks while wearing my I AM BOSS shirt. I like my action series well choreographed, and my romance series extra trashy. I also have a soft spot for puns.

{ 10 comments }

Stripey September 4, 2007 at 12:44 am

Those lolicious pictures actually gave me (an IT-challenged tanuki) the courage to want to try this potentially catastrophic conversion.

I need to lay off Moetan. XD

Kabitzin September 4, 2007 at 12:49 am

Look into those sad loli eyes and know that she can’t convert that database without you.

While it sounds like a lot can go wrong, the actual process was pretty easy with the plugin. I don’t want to think about what it was like before the plugin, or I’ll start crying too.

If your database blows up though, I want Zyl to know I had nothing to do with it.

0rion September 4, 2007 at 2:51 am

tl;dr…but those pictures were great.

Just kidding, good luck with that, hope your site does not asplode.

Hinano September 4, 2007 at 6:23 am

Thanks for the information ^^ I’ll let JP know, he’s been tinkering with this crap all weekend.

Kabitzin September 4, 2007 at 7:58 am

Hinano, I’m not sure of how JP originally backed up the data, so he may want to read some of the gory details of how the conversion works.

First, make a backup of your database. This doesn’t mean do a mysql dump (which will get you a bunch of unusable rubbish), it means make a PHYSICAL COPY of the database folder, which you can find on your server at:

$ cd /some/where/mysql/data/mydatabasename

Otherwise it sounds like he will be unable to simply run the plugin to convert the text. I have heard bad things about Dreamhost support, but if he does not have a physical copy, perhaps Dreamhost has a backed up version on their servers. It may be worth asking them.
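The physical copy described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the MySQL server is stopped (or at least idle) while the folder is copied, and the real data folder lives wherever your host put it. The helper name `copy_database_folder` and the file name `wp_posts.MYD` are hypothetical, and the demo runs on throwaway folders rather than a real server path.

```python
import shutil
import tempfile
import time
from pathlib import Path


def copy_database_folder(data_dir, backup_root):
    """Copy a database folder into a new, timestamped backup folder."""
    data_dir = Path(data_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{data_dir.name}-{stamp}"
    # copytree refuses to overwrite an existing target, so an older
    # backup is never silently replaced.
    shutil.copytree(data_dir, target)
    return target


# Demo on scratch folders standing in for the real data directory.
with tempfile.TemporaryDirectory() as scratch:
    fake_db = Path(scratch) / "mydatabasename"
    fake_db.mkdir()
    (fake_db / "wp_posts.MYD").write_bytes(b"pretend table data")

    backup = copy_database_folder(fake_db, Path(scratch) / "backups")
    copied_ok = (backup / "wp_posts.MYD").read_bytes() == b"pretend table data"
    print(copied_ok)
```

On a real server the first argument would be the database folder from the `cd` line above, and the second any location outside the MySQL data directory.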

Hinano September 4, 2007 at 9:03 am

He told me he backed everything up XD I got an email from him this morning telling me he was using the same file and said thanks to your post he’ll be able to get everything rolling by tonight, which is good ’cause I wanna do a sleazy School Days post! ^_^

Kabitzin September 4, 2007 at 9:11 am

Tell him to hurry up, because I want to read your sleazy School Days post, lol =D. When stripped of cruft, our database was about 7 MB, and took maybe 30 seconds to convert.

Kabitzin September 4, 2007 at 12:34 pm

I should mention that so far, Chris has helpfully pointed out one comment that got messed up in the conversion. It was an apostrophe that got converted into three characters of gibberish.

So far, in a few cases where commenters used an apostrophe, an ellipsis, or quotes, the symbol got changed to junk. So I guess the plugin is not perfect. However, from what I can tell, the plugin worked correctly in most cases (in posts, in links, etc.).
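One plausible explanation for the "three characters" is that a curly apostrophe takes three bytes in UTF-8, and each byte shows up as its own character when read as Windows-1252 (which is what MySQL's latin1 actually is). The Python sketch below shows that effect. Whether this is exactly what happened inside the plugin is an assumption, and `misread_as_cp1252` is just an illustrative helper.

```python
def misread_as_cp1252(text):
    """Encode text as UTF-8, then decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


# Curly apostrophe, ellipsis, and left double quote.
for symbol in ["\u2019", "\u2026", "\u201c"]:
    junk = misread_as_cp1252(symbol)
    print(symbol, "->", junk, "(", len(junk), "characters )")

# The damage is reversible while the junk is still intact.
fixed = misread_as_cp1252("\u2019").encode("cp1252").decode("utf-8")
print(fixed == "\u2019")  # True
```

Each of the three symbols comes out as three junk characters beginning with "â€", which would fit the pattern of only "smart" punctuation breaking while ordinary text stays fine.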

Jesus159159159 September 4, 2007 at 3:10 pm

*stares at loli pictures* Well, Mr. Kabitzin, looks like you made me turn my PC into a MAC again! *BA DUM CHING!*…

Anyways, if I ever make a blog/website, I’ll make sure to visit these helpful guides again! :)

jpmeyer September 4, 2007 at 8:40 pm

You know, last night I was juuuuuuuuust about to use that plugin before my DNS got screwed up and I couldn’t access my original host. Seeing this post was helpful since it let me know that it does all the work for me. The DNS issue straightened itself up this morning and I ran the plugin tonight with zero problems. Yippee!
