Let me preface this by saying this script (at the bottom of this post) is not my work. A friend wrote it, and we found that it’s not working as we’d expect. I have zero knowledge of SQL.
Summary:
This SQL script (below) is supposed to produce results for tables with the char
, text
, tinytext
, mediumtext
, longtext
, or varchar
data-types. Instead, it is only producing results for varchar
.
Intended purpose of this script:
When run on information_schema
, generate a series of SQL commands that will, on each column:
- Cast the data as latin1
- Convert the data to binary
- Convert the binary data back to text, encoded in the UTF-8 character set.
I am trying to fix data in an old WordPress database, as instructed here. This script is supposed to automatically generate the SQL script I will use to convert all my columns.
Expected results when run:
Output that looks something like this for all tables with a data-type of char
, text
, tinytext
, mediumtext
, longtext
, or varchar
:
[...]
alter table products change productName productName /*1*/ VARCHAR(70) character set latin1;
alter table products change productName productName /*2*/ VARBINARY(70);
alter table products change productName productName /*3*/ VARCHAR(70) character set utf8;
alter table products change productDescription productDescription /*1*/ LONGTEXT character set latin1;
alter table products change productDescription productDescription /*2*/ LONGBLOB;
alter table products change productDescription productDescription /*3*/ LONGTEXT character set utf8;
[...]
Actual results when run:
The script, when run on information_schema via phpMyAdmin, only creates commands for columns with the varchar
data-type. Any other data-type (such as text
or longtext
) is ignored; no commands are created for tables with that data-type.
[...]
alter table customers change addressLine1 addressLine1 /*1*/varchar(50) character set latin1; /* varchar latin1 */
alter table customers change addressLine1 addressLine1 /*2*/varbinary(50); /* varchar latin1 */
alter table customers change addressLine1 addressLine1 /*3*/varchar(50) character set utf8; /* varchar latin1 */
alter table customers change addressLine2 addressLine2 /*1*/varchar(50) character set latin1; /* varchar latin1 */
alter table customers change addressLine2 addressLine2 /*2*/varbinary(50); /* varchar latin1 */
alter table customers change addressLine2 addressLine2 /*3*/varchar(50) character set utf8; /* varchar latin1 */
[...]
Database used for testing:
The script:
select concat( 'alter table '
, table_name
, ' change '
, column_name
, ' '
, column_name
, ' /*1*/'
, data_type
, case
when data_type = 'varchar' then concat( '(', character_maximum_length, ')' )
end
, ' character set latin1; /* '
, data_type, ' ', character_set_name
, ' */' ) as cmd
from columns
where table_schema = 'classicmodels'
and data_type in ('char','text','tinytext','mediumtext','longtext','varchar')
UNION
select concat( 'alter table '
, table_name
, ' change '
, column_name
, ' '
, column_name
, ' /*2*/'
, case when data_type = 'char' then 'binary'
when data_type = 'text' then 'blob'
when data_type = 'tinytext' then 'tinyblob'
when data_type = 'mediumtext' then 'mediumblob'
when data_type = 'longtext' then 'longblob'
when data_type = 'varchar' then 'varbinary'
end
, case
when data_type = 'varchar' then concat( '(', character_maximum_length, ')' )
end
, '; /* '
, data_type, ' ', character_set_name
, ' */' ) as cmd
from columns
where table_schema = 'classicmodels'
and data_type in ('char','text','tinytext','mediumtext','longtext','varchar')
UNION
select concat( 'alter table '
, table_name
, ' change '
, column_name
, ' '
, column_name
, ' /*3*/'
, data_type
, case
when data_type = 'varchar' then concat( '(', character_maximum_length, ')' )
end
, ' character set utf8; /* '
, data_type, ' ', character_set_name
, ' */' ) as cmd
from columns
where table_schema = 'classicmodels'
and data_type in ('char','text','tinytext','mediumtext','longtext','varchar')
order by 1;
What do I need to change in the script to make it properly run on all data-types?
Thank you.
This is an interesting script.
It appears that you’re not getting the non-
VARCHAR()
types because theCASE
statements whichCONCAT()
the varchar’scharacter_maximum_length
do not have anELSE
case, and I think those rows are therefore being excluded from theUNION
output. A quick test shows that I can add anELSE ''
to return an empty string whencase when data_type = 'varchar'
is not matched:For example:
You’ll need to add that in a few places, three in all. The whole script becomes:
I loaded your test database and ran the test script against
information_schema
. It produced the following output (abbreviated):