collations.md

###Background

My table looks like this:

id (int), hash_value (text)

with the following settings:

InnoDB
utf8
utf8_unicode_ci

the data I store in the table hash_seeds which looks like this

id  hash_value
--  ---------------------
1   ab87g:nGsqr:7y9H4
2   ab8hu:ngsqr:88hYt

The hash values later become unique identifiers for pens rows. For example, in the table

id  name    slug
--  ------  -------------
77  duder   nGsqr

the slug nGsqr is part of the hash_value column.

###Problem

We expect to do lookups in a case-sensitive way, but the collation may prevent that. For example, the pens table will look like this someday:

id  name    slug
--  ------  -------------
77  duder   nGsqr
78  friend  ngsqr

We do a lookup on the pens table by slug, and I think we'll have problems if we update or select with the wrong collation. Since the collation is utf8_unicode_ci, (the _ci means case-insentive) is case-insenstive, we could conceivibly update or delete the wrong row.

For example, in the hash_seeds table above, the query

Select hash_value from hash_seeds where hash_value like '%nGsqr%'

returns rows 1 and 2. However, the query

Select hash_value from hash_seeds where hash_value like _utf8 '%nGsqr%' collate utf8_bin

returns only row 1.

###Question

I've tried resetting the collation of that table to utf8_bin, but without success. The query

Select hash_value from hash_seeds where hash_value like '%nGsqr%'

still returns both rows.

So, is there something I need to do to force the table to understand my collation, or do I have to edit each query to include the _utf8 and collate indicators?

-- ---------------------------- -- Table structure for `hash_seeds` -- ---------------------------- DROP TABLE IF EXISTS `hash_seeds`; CREATE TABLE `hash_seeds` ( `id` int(11) NOT NULL AUTO_INCREMENT, `hash_value` text CHARACTER SET utf8 COLLATE utf8_bin, `created_at` datetime NOT NULL, `updated_at` datetime NOT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=60234 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

tsabat/collations.md

tsabat commented Sep 3, 2012

Uh oh!