So, what is a good way to convert to true utf8 character set with a unicode collation.
Rebuild the data! This is the way that I'm doing it to also shrink the innodb datafile. There is a lot of free space.
Now remember we have utf8 data in a latin1 container column, so we need to make sure that we get a dump of the raw data. MySQL will double pad the data with default options converting the latin1 charset to utf8, which is garbage.
mysqldump -hHOST -uroot -p DB TABLE --no-create-info --compatible=mysql323
This statement will dump the raw data, the option that tells mysqldump to do so is mysql323 compatible.
My personal preference is to
mysqldump -hHOST -uroot -p DB TABLE -T /tmp/
What this does is SELECT * INTO OUTFILE '/tmp/TABLE.txt'
This to will get raw data.
Since all of my tables are INNODB, dumping it out sorted by the PRIMARY index is good for re-import via LOAD DATA.
Next we dump the table structure and remove all occurances of latin1 on each of the column definitions.
Finally reimport the data
Start the slave.
So here is a good procedure.
- Stop the Slave
- Stop MySQL
- Put in your my.cnf skip-slave-start
- Start mySQL (we do the above steps to make sure everything is flushed to disk just to be sure)
- Next check that the dump point has enough space
- Dump the data via the method described above for each table
- Dump the table definition
- Stop mySQL
- Destroy the INNODB datafile and log files rm /var/lib/mysql/ib*
- Remove Database directory
- Edit my.cnf and define default startup for the server
# on every connection we tell the server that transfer data as utf8 so it doesn't translate data comming in for us. SET NAMES is equivlent to
# SET character_set_client = utf8; SET character_set_connection = utf8;SET character_set_results = utf8
init-connect="SET NAMES = utf8"
- Start mySQL
- CREATE DATABASE DATABASENAME DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci
- Remove all references to latin1 from the column definition of each table
- At the end of each table definition add ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
- Add the tables to the database
- IMPORT THE DATA
Let me know if you have any questions.