###Background
My table looks like this:
id (int), hash_value (text)
with the following settings:
InnoDB
utf8
utf8_unicode_ci
the data I store in the table hash_seeds
which looks like this
id hash_value
-- ---------------------
1 ab87g:nGsqr:7y9H4
2 ab8hu:ngsqr:88hYt
The hash values later become unique identifiers for pens
rows. For example, in the table
id name slug
-- ------ -------------
77 duder nGsqr
the slug nGsqr is part of the hash_value
column.
###Problem
We expect to do lookups in a case-sensitive way, but the collation may prevent that. For example, the pens
table will look like this someday:
id name slug
-- ------ -------------
77 duder nGsqr
78 friend ngsqr
We do a lookup on the pens
table by slug
, and I think we'll have problems if we update or select with the wrong collation. Since the collation is utf8_unicode_ci
, (the _ci
means case-insentive) is case-insenstive, we could conceivibly update or delete the wrong row.
For example, in the hash_seeds
table above, the query
Select hash_value from hash_seeds where hash_value like '%nGsqr%'
returns rows 1 and 2. However, the query
Select hash_value from hash_seeds where hash_value like _utf8 '%nGsqr%' collate utf8_bin
returns only row 1.
###Question
I've tried resetting the collation of that table to utf8_bin
, but without success. The query
Select hash_value from hash_seeds where hash_value like '%nGsqr%'
still returns both rows.
So, is there something I need to do to force the table to understand my collation, or do I have to edit each query to include the _utf8
and collate
indicators?
Ok, so I've made some progress. I've set the defaults on the server to character set utf8 and collation to utf8_bin. The server looks like this now:
I followed the instructions on stack overflow for the
my.cnf
file.I did have to add the
skip-character-set-client-handshake
directive to get all collations to read utf8_bin. Is that bad?