- Make sure mysql-client is installed. If not, then :
sudo apt install mysql-client
or
sudo apt-get install mysql-client
-
Open php.ini
; PHP's default character set is set to UTF-8. ; http://php.net/default-charset default_charset = "UTF-8"
Source: https://www.toptal.com/php/a-utf-8-primer-for-php-and-mysql
-
Disconnect all active applications connected to mysql and take a backup of the database.
-
Speak UTF-8 everywhere. Update
mysqld
,mysql
andclient
settings as follows(/etc/mysql/*.cnf
):- On your local machine
[client] default-character-set = utf8mb4 [mysql] default-character-set = utf8mb4 [mysqld] character_set_server=utf8mb4 collation_server=utf8mb4_unicode_ci #The following should be set if you are using mysql version 5.6 or lower innodb_file_format=barracuda innodb_file_per_table=1 innodb_large_prefix=1
Source: https://mathiasbynens.be/notes/mysql-utf8mb4
Source: http://mechanics.flite.com/blog/2014/07/29/using-innodb-large-prefix-to-avoid-error-1071/- On AWS RDS set the following:
[mysqld] innodb_file_format=barracuda innodb_file_per_table=1 innodb_large_prefix=1 character_set_client: utf8mb4 character_set_database: utf8mb4 character_set_results: utf8mb4 character_set_connection: utf8mb4 collation_connection: utf8mb4_unicode_ci collation_server: utf8mb4_unicode_ci character_set_server: utf8mb4
Source: http://aprogrammers.blogspot.in/2014/12/utf8mb4-character-set-in-amazon-rds.html
Source: http://mechanics.flite.com/blog/2014/07/29/using-innodb-large-prefix-to-avoid-error-1071/ -
Restart mysql service
-
Connect to mysql using command line:
mysql -u username -p
-
Run the following query:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
Output:
+--------------------------+--------------------+ | Variable_name | Value | +--------------------------+--------------------+ | character_set_client | utf8mb4 | | character_set_connection | utf8mb4 | | character_set_database | utf8mb4 | | character_set_filesystem | binary | | character_set_results | utf8mb4 | | character_set_server | utf8mb4 | | character_set_system | utf8 | | collation_connection | utf8mb4_general_ci | | collation_database | utf8mb4_unicode_ci | | collation_server | utf8mb4_unicode_ci | +--------------------------+--------------------+ 10 rows in set (0.05 sec)
-
Convert your Latin-1 collated tables to UTF-8 a. Individual queries on each table : https://codex.wordpress.org/Converting_Database_Character_Sets b. A mysql dump and restoration of the dump : https://www.bluebox.net/insight/blog-article/getting-out-of-mysql-character-set-hell
-
Take a database dump as follows:
mysqldump --verbose -u root -p --add-drop-database --opt --skip-set-charset --default-character-set=latin1 --skip-extended-insert --databases databasename > dump_file.sql
Note: On the mysqldump command, the --skip-set-charset and --default-char-set=latin1 options should prevent MySQL from taking the already-Latin-1-collated table and “helpfully” converting it to any other character set for you. This should ensure that your mysqldump is really in the Latin-1 character encoding scheme.The --skip-extended-insert option forces mysqldump to put each INSERT command in the dump on its own line. This will make the dump take much longer to re-import, however, in my experimentation, adding this option was enough to prevent the dump from having syntax errors in in anywhere. And in any case, should the re-import fail for any reason, having each row's data on its own line really helps to be able to zero-in on which rows are causing you problems (and gives you easier options to work-around the problem rows).*
-
Replace all latin1 with utf8
perl -i -pe 's/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci/' dump_file.sql`
```
The first command replaces all instances of DEFAULT CHARACTER SET latin1 with DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci. This is used to fix up the database's default charset and collation.
```perl
perl -i -pe 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8mb4 COLLATE utf8mb4_unicode_ci ROW_FORMAT=DYNAMIC/' dump_file.sql
```
The second command replaces all instances of DEFAULT CHARSET=latin1 with DEFAULT CHARSET=utf8mb4. This converts all tables from using latin1 to using utf8mb4.
*Source : https://docs.moodle.org/24/en/Converting_your_MySQL_database_to_UTF8#Linux_.26_Mac*
11. Restore the Dump
```
nohup mysql -v -u username -ppassword < dump_file.sql & (to run i background)
or
mysql -v -u username -p < dump_file.sql (to run in foreground)
```
*Source: https://www.maketecheasier.com/run-bash-commands-background-linux/*
12. Repair the tables for any problems in-case
```
mysqlcheck -u root -p --auto-repair --optimize --all-databases
```
13. In your application, execute the following query on your application database and verify the result:
```mysql
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
```
```
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
```
If the result is not as above, pefrom the following steps.
14. [CakePHP] Open database.php and set encoding to utf8mb4 as follows.
```php
class DATABASE_CONFIG {
public $default = array(
'datasource' => 'Database/Mysql',
'persistent' => false,
'host' => 'localhost',
'login' => 'username',
'password' => 'password!',
'database' => 'MySQL Database Name',
'prefix' => '',
'encoding' => 'utf8mb4'
);
}
```
15. [SailsJS] Open connections.js in your SailsJS application and set as follows:
```js
mysql: {
adapter : 'sails-mysql',
host : 'localhost',
port : 3306,
user : 'username',
password : 'password',
database : 'MySQL Database Name'
// OR (explicit sets take precedence)
adapter : 'sails-mysql',
url : 'mysql2://USER:PASSWORD@HOST:PORT/DATABASENAME'
// Add the following
charset : 'utf8mb4',
collation : 'utf8mb4_unicode_ci'
}
```
*Source: https://github.com/balderdashy/sails-mysql#sails-configuration*
Sources
---
- MOST RELIABLE : https://www.bluebox.net/insight/blog-article/getting-out-of-mysql-character-set-hell
- If your database isn't big, also proposes the fastest solution : https://codex.wordpress.org/Converting_Database_Character_Sets
- If you are using AWS-RDS, steps required: http://aprogrammers.blogspot.in/2014/12/utf8mb4-character-set-in-amazon-rds.html
- Why to use MySQL utf8mb4 instead of utf8 - https://mathiasbynens.be/notes/mysql-utf8mb4
- User freindly way to figure out if your database has encodig issues: https://www.whitesmith.co/blog/latin1-to-utf8/
- For PHP Applications : https://www.toptal.com/php/a-utf-8-primer-for-php-and-mysql
- If you are using MySQL version 5.6 or lower : http://mechanics.flite.com/blog/2014/07/29/using-innodb-large-prefix-to-avoid-error-1071/
- https://docs.moodle.org/24/en/Converting_your_MySQL_database_to_UTF8#Linux_.26_Mac
- http://blog.socialcast.com/rails-2-mysql-and-character-sets/