@ZweiSteinSoft
Created March 1, 2012 00:51
PHP: Convert two- and three-byte UTF-8 characters in a string to numeric HTML entities (&#NNN;). In effect, an extended htmlentities().
<?php
function utf8_decode2($strText)
{
    // Only do the slow conversion if the string contains 8-bit characters.
    // Avoid using 0xA0 (\240) in preg ranges; RH73 does not like that.
    if (! preg_match('/[\200-\237]/', $strText) && ! preg_match('/[\241-\377]/', $strText))
    {
        return $strText;
    }
    // Convert three-byte UTF-8 sequences to numeric entities.
    // preg_replace_callback() is used because the /e modifier was
    // deprecated in PHP 5.5 and removed in PHP 7.0.
    $strText = preg_replace_callback("/([\340-\357])([\200-\277])([\200-\277])/",
        function ($m) {
            return '&#'.((ord($m[1])-224)*4096 + (ord($m[2])-128)*64 + (ord($m[3])-128)).';';
        },
        $strText);
    // Convert two-byte UTF-8 sequences to numeric entities.
    $strText = preg_replace_callback("/([\300-\337])([\200-\277])/",
        function ($m) {
            return '&#'.((ord($m[1])-192)*64 + (ord($m[2])-128)).';';
        },
        $strText);
    // Four-byte sequences (lead bytes \360-\367) are left untouched.
    return $strText;
}
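
The arithmetic inside the two replacements can be checked by hand on single characters. The sketch below is illustrative only: `utf8_sequence_to_codepoint()` is a hypothetical helper, not part of the gist, and it assumes it is handed exactly one well-formed two- or three-byte UTF-8 sequence.

```php
<?php
// Hypothetical helper (not part of the gist): rebuilds a code point from the
// raw bytes of ONE two- or three-byte UTF-8 sequence, using the same
// arithmetic as utf8_decode2(). No validation of the input is attempted.
function utf8_sequence_to_codepoint($bytes)
{
    // Turn the byte string into an array of integer byte values.
    $b = array_map('ord', str_split($bytes));

    if (count($b) === 2) {
        // 110xxxxx 10yyyyyy: subtract the 0xC0 (192) and 0x80 (128) markers.
        return ($b[0] - 192) * 64 + ($b[1] - 128);
    }
    if (count($b) === 3) {
        // 1110xxxx 10yyyyyy 10zzzzzz: subtract 0xE0 (224) and 0x80 (128).
        return ($b[0] - 224) * 4096 + ($b[1] - 128) * 64 + ($b[2] - 128);
    }
    return null; // other lengths are outside the scope of this sketch
}

// "é" is 0xC3 0xA9 in UTF-8: (195-192)*64 + (169-128) = 233
echo '&#' . utf8_sequence_to_codepoint("\xC3\xA9") . ";\n";     // &#233;

// "€" is 0xE2 0x82 0xAC: (226-224)*4096 + (130-128)*64 + (172-128) = 8364
echo '&#' . utf8_sequence_to_codepoint("\xE2\x82\xAC") . ";\n"; // &#8364;
```

The multipliers 64 and 4096 come from the six payload bits carried by each continuation byte (2^6 and 2^12), which is why the same constants appear in both replacement expressions.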