Changes III's unicode brackets into encoded unicode characters.
<?php
/*
Changes III's Unicode bracket codes into UTF-8 encoded characters.
We have found this more reliable than using the pre-encoded values from the interface. The XRecord will give the bracket output.
This presumes your III database is stored in Unicode. III can convert your database to Unicode for you; call the helpdesk.
If you see codes like {231} without the 'u', then the database hasn't been converted or the codes are being entered in the old format.
Example Input: al-I{u02BB}tir{u0101}f{u0101}t
Example Output: al-Iʻtirāfāt
*/
$matches = array();
$string = "al-I{u02BB}tir{u0101}f{u0101}t"; // our example string
print "Input: $string\n";
// find all the {uXXXX} codes; group 1 captures only the four hex digits
preg_match_all('/\{u([0-9a-fA-F]{4})\}/', $string, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    $code = hexdec($match[1]); // convert the hex digits to decimal (passing the braces and 'u' to hexdec() is deprecated since PHP 7.4)
    $character = html_entity_decode("&#$code;", ENT_NOQUOTES, 'UTF-8'); // decode the numeric entity into a UTF-8 character
    $string = str_replace($match[0], $character, $string); // replace the whole {uXXXX} code with the UTF-8 character
}
print "Output: $string\n";
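
// A reusable variant of the loop above, offered as a sketch under assumptions:
// the function name decode_iii_brackets() is hypothetical (it is not part of III
// or of the original script), and it relies on mb_chr(), which needs PHP 7.2+
// with the mbstring extension. preg_replace_callback() decodes every {uXXXX}
// code in a single pass; old-format codes such as {231} are left untouched.
function decode_iii_brackets($text) {
    return preg_replace_callback(
        '/\{u([0-9a-fA-F]{4})\}/',
        function ($m) {
            $char = mb_chr(hexdec($m[1]), 'UTF-8'); // code point -> UTF-8 character
            return $char === false ? $m[0] : $char; // keep the code if it is not a valid code point
        },
        $text
    );
}
print "Function output: " . decode_iii_brackets("al-I{u02BB}tir{u0101}f{u0101}t") . "\n";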
?>