I’m working with an old WordPress database. In the backend, some posts contain characters like:
ü
But they display on the front end as:
ü
Changing define('DB_CHARSET', 'utf8');
in wp-config.php solves this particular issue, however for future compatibility (see: https://make.wordpress.org/core/2015/04/02/the-utf8mb4-upgrade/), I believe it’s a bad idea to leave the database as is and would like to convert it. Entering a ü
directly into the PHP template works without changing anything. So this is clearly a mySQL encoding issue.
I’ve taken the following steps so far:
- Run the following commands from PHPMyAdmin:
ALTER DATABASE MyDb CHARACTER SET utf8;
and
ALTER TABLE wp_posts CHARACTER SET utf8;
and
UPDATE wp_posts SET post_content = convert(cast(convert(post_content using latin1) as binary) using utf8)
(taken from here: MySQL – Convert latin1 characters on a UTF8 table into UTF8)
I’ve also used this tool: https://interconnectit.com/products/search-and-replace-for-wordpress-databases/ to convert all tables to utf-8 mb4 encoding.
alter table wp_posts change post_content post_content LONGTEXT CHARACTER SET latin1;
alter table wp_posts change post_content post_content LONGBLOB;
alter table wp_posts change post_content post_content LONGTEXT CHARACTER SET utf8;
-
I’ve also tried this approach:
http://alexking.org/blog/2008/03/06/mysql-latin1-utf8-conversion -
Completely emptied wp_posts and tried entering a new post.
With all these approaches, when creating a new post and entering an ü
character, it still shows up as ü
.
This thread here also clearly describes the issues I’m having:
https://wordpress.org/support/topic/trouble-converting-database-containing-special-characters-to-utf-8
Any recommendations or insights from here are much appreciated as I’m completely stuck!
Thanks
The database and table declarations are merely defaults for future table and column declarations, respectively.
ALTER TABLE CHANGE ...
changes the declaration without transforming the encoding. That is, this may have made things worse.First, let’s verify what you have. Please provide
SHOW CREATE TABLE
for a table of interest. And provide this for some data in it that should have an accented character such asü
:If the HEX is
FC
, you have latin1 encoding. If the column saysCHARACTER SET latin1
, then they are consistent. (All latin1 encodings are 1 byte.)If the HEX is
C3BC
, you have utf8 encoding. If the column saysCHARACTER SET utf8
(orutf8mb4
), then they are consistent. (utf8 encodings for accented characters are usually 2 bytes.)If the encoding and the charset are not consistent, then panic. You have a mess that needs cleaning up.
When “entering” data, two things are important: (1) the encoding of the character, and (2) what the client thinks the encoding is. The encoding depends on where you got the character (how you typed it, pasted it, etc). You tell the client to tell mysqld what the encoding is in various ways.
define('DB_CHARSET', 'utf8');
is needed for WordPress; something else is needed for phpmyadmin.As long as you configure the client correctly, it does not matter whether the column’s
CHARACTER SET
is the same. Conversion will automatically occur. (However, it is best to be consistent — utf8 (or utf8mb4) at all stages.)ALTER TABLE tbl CONVERT TO CHARACTER SET utf8mb4;
will actively change theCHARACTER SET
and the bytes in all the columns of that table. This will work whether the data is currently latin1 or utf8, but it must be consistent with the declaration (as mentioned above).Please use the code for convert your WordPress database. It is enough to change the appropriate sections in the code for charet…