Differences in Collation

Posts: 57
Joined: 01/23/2008
Bug Finder

Hi!

I have been discussing the stock module with TR, and we have found an issue with the collation of some MySQL tables. Two different collations appear in Ubercart's CREATE TABLE statements:

COLLATE utf8_unicode_ci
COLLATE utf8_general_ci

This can cause errors in joins between tables that use different collations, as I've experienced with uc_product_adjustments.

I think all the tables should have the same collation, to avoid unexpected behavior.
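As a rough way to check this, the sketch below groups CREATE TABLE statements by the collation they declare. The statements and column lists are made up for illustration, including which table gets which collation, and the helper name `collations_by_table` is hypothetical. It only looks at the first COLLATE clause in each statement, so a column-level COLLATE would confuse it.

```python
import re
from collections import defaultdict

# Hypothetical CREATE TABLE statements; columns and collation assignments are invented.
SCHEMA = [
    "CREATE TABLE uc_products (model VARCHAR(255)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci",
    "CREATE TABLE uc_product_adjustments (model VARCHAR(255)) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci",
    "CREATE TABLE uc_orders (order_id INT) DEFAULT CHARACTER SET utf8",
]

TABLE_RE = re.compile(r"CREATE\s+TABLE\s+`?(\w+)`?", re.IGNORECASE)
COLLATE_RE = re.compile(r"COLLATE\s*=?\s*(\w+)", re.IGNORECASE)


def collations_by_table(statements):
    """Map each declared collation name to the tables that declare it."""
    groups = defaultdict(list)
    for sql in statements:
        table = TABLE_RE.search(sql)
        if table is None:
            continue  # not a CREATE TABLE statement
        collate = COLLATE_RE.search(sql)
        # No explicit COLLATE clause: the table falls back to a default collation.
        name = collate.group(1) if collate else "(default)"
        groups[name].append(table.group(1))
    return dict(groups)


groups = collations_by_table(SCHEMA)
for collation, tables in sorted(groups.items()):
    print(collation, tables)
```

If more than one collation name shows up in the result, the schema mixes collations and joins across those tables are candidates for the kind of error described above.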

Regards
Pedro

Posts: 2008
Joined: 08/07/2007
Administrator | eLiTe!

Hmmm...what's the difference between those two collations? A quick search through the code revealed that all the tables I made just specify the CHARACTER SET UTF8, which I guess uses the default collation. I think I made this decision because that's what Drupal does. I don't think we care which collation we use, so long as we're consistent.

I recommend we take out all of the COLLATE statements in our code.
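For what it's worth, in MySQL the default collation for the utf8 character set is utf8_general_ci, so a table that specifies only the character set ends up with that collation. A minimal sketch of stripping the COLLATE clauses, assuming the table definitions are available as plain SQL strings (the `strip_collate` helper and the example statement are hypothetical):

```python
import re

# Matches a COLLATE clause such as "COLLATE utf8_unicode_ci" or "COLLATE=utf8_general_ci",
# together with the whitespace in front of it.
COLLATE_CLAUSE = re.compile(r"\s+COLLATE\s*=?\s*\w+", re.IGNORECASE)


def strip_collate(sql):
    """Return the statement with every COLLATE clause removed."""
    return COLLATE_CLAUSE.sub("", sql)


# Hypothetical statement, for illustration only.
before = "CREATE TABLE example (name VARCHAR(255)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci"
after = strip_collate(before)
print(after)
```

This only changes how new tables are created. Tables that already exist keep whatever collation they were created with until they are converted.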

Posts: 57
Joined: 01/23/2008
Bug Finder

Hi Lyle,

You can find the differences between these collations here:

http://forums.mysql.com/read.php?103,187048,188748#msg-188748

To sum up, utf8_general_ci is a simpler collation, which is faster and simplifies text when comparing it. As the post linked above says:

Quote:

For example, these Latin letters: ÀÁÅåāă (and all other Latin letters "a"
with any accents and in any cases) are all compared as equal to "A".
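To make the quoted behavior concrete, here is a rough sketch of an accent- and case-insensitive comparison. It is not MySQL's algorithm (MySQL uses its own weight tables); it only approximates the effect with Unicode decomposition, and the function names `fold` and `equal_ci_ai` are made up for this example.

```python
import unicodedata


def fold(text):
    """Strip combining accents and case, so 'Á', 'å' and 'A' all map to 'A'."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.upper()


def equal_ci_ai(left, right):
    """Compare two strings while ignoring case and accents."""
    return fold(left) == fold(right)


letters = "ÀÁÅåāă"
print([fold(ch) for ch in letters])
print(equal_ci_ai("Á", "a"))
```

The folding is applied only for the comparison; the input strings themselves are left unchanged.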

utf8_unicode_ci is slower than utf8_general_ci but offers extended multilingual support.

If it were up to me, I would clearly choose utf8_unicode_ci, even though it is slower, for two reasons: multilingual support, and not losing data in possible migrations or upgrades. (If you have A in a table, it doesn't matter which collation is used, but if you have a unicode table with Á and you convert it to general, you will end up with A, which to me means lost information.)

Just my two cents :)
Regards

Posts: 822
Joined: 11/05/2007
Bug Finder | FAQ Moderator | Getting busy with the Ubercode.

And here is a concrete problem caused by tables with different collations:

http://www.ubercart.org/forum/bug_reports/3423/incorrect_collation_table...
http://www.ubercart.org/contrib/2481

--

<tr>.

Posts: 1139
Joined: 08/14/2007
Bug Finder | Early adopter... addicted to alphas. | Getting busy with the Ubercode.

Interestingly enough, I think this only goes back a few versions in MySQL. I'm using 4.2.3 on our server, and I haven't seen this issue pop up before. I suspect users who see collation issues breaking queries are on older versions of MySQL.
