mySQL DBA, Architecture, Dev, Scale, HA, Code : Note about utf8

Monday, August 07, 2006

Note about utf8_bin

A utf8_bin collation on UNIQUE columns needs to be used with caution. With latin1 and its default case-insensitive collation (latin1_swedish_ci), f == F, so a duplicate-key error is thrown if a user enters 'F' when 'f' already exists.

With utf8_bin, f != F, so both values are accepted as distinct rows. Make sure to normalize your data before sticking it into the db layer, i.e. put all email addresses, tags, etc. in your db as lowercase :)
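Here is a rough sketch of what "normalize before insert" could look like. It is not MySQL code: it uses Python's built-in sqlite3 module as a stand-in, on the assumption that SQLite's default BINARY collation behaves like utf8_bin for this purpose (byte-wise comparison, so 'F' and 'f' differ). The users table and the helpers normalize_email and insert_email are made-up names for illustration.

```python
import sqlite3


def normalize_email(raw: str) -> str:
    """Trim whitespace and lowercase, so 'Fred@X.com' and 'fred@x.com' collide."""
    return raw.strip().lower()


def insert_email(conn: sqlite3.Connection, raw: str) -> bool:
    """Insert the normalized email; return False if it is already present."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (email) VALUES (?)",
                (normalize_email(raw),),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint fired: same value after normalization.
        return False


# Stand-in for a utf8_bin UNIQUE column: SQLite TEXT columns compare
# byte-wise (BINARY collation) unless told otherwise.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

first = insert_email(conn, "fred@example.com")
second = insert_email(conn, "Fred@Example.com")  # duplicate once lowercased
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Without the call to normalize_email, both inserts would go through and the table would hold two rows that differ only by case, which is exactly the situation to avoid with a binary collation.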

Thanks for reminding me, Peter Z!!

3 comments:

Cesium said...: Are there session settings that screw up utf8 string comparisons? I was inserting a unique varchar field from Ruby that was the French spelling for "Hotel" (circumflex over the 'o'), and got a duplicate key error. Which was weird because we had just executed SQL to check for the existence of the field and the application was running with no other processes in the system. Additionally, copying the SQL to a mysql terminal session and executing it successfully inserted the row.; Sun Sep 17, 01:21:00 PM
Seun Osewa said...: This comment has been removed by the author.; Thu Oct 30, 02:47:00 AM
Seun Osewa said...: Why use utf8_bin? Is it faster?; Thu Oct 30, 02:48:00 AM