###Background
My table looks like this:
id (int), hash_value (text)
with the following settings:
InnoDB
utf8
utf8_unicode_ci
the data I store in the table hash_seeds
which looks like this
id hash_value
-- ---------------------
1 ab87g:nGsqr:7y9H4
2 ab8hu:ngsqr:88hYt
The hash values later become unique identifiers for pens
rows. For example, in the table
id name slug
-- ------ -------------
77 duder nGsqr
the slug nGsqr is part of the hash_value
column.
###Problem
We expect to do lookups in a case-sensitive way, but the collation may prevent that. For example, the pens
table will look like this someday:
id name slug
-- ------ -------------
77 duder nGsqr
78 friend ngsqr
We do a lookup on the pens
table by slug
, and I think we'll have problems if we update or select with the wrong collation. Since the collation is utf8_unicode_ci
, (the _ci
means case-insentive) is case-insenstive, we could conceivibly update or delete the wrong row.
For example, in the hash_seeds
table above, the query
Select hash_value from hash_seeds where hash_value like '%nGsqr%'
returns rows 1 and 2. However, the query
Select hash_value from hash_seeds where hash_value like _utf8 '%nGsqr%' collate utf8_bin
returns only row 1.
###Question
I've tried resetting the collation of that table to utf8_bin
, but without success. The query
Select hash_value from hash_seeds where hash_value like '%nGsqr%'
still returns both rows.
So, is there something I need to do to force the table to understand my collation, or do I have to edit each query to include the _utf8
and collate
indicators?
For the move, we'll
mysqldump
and load the files using the collation set above. i'll pass them through the following sed to change the collation on the columns:Do you see any problems with that? Here's an example table: