@matschundbrei
Last active September 5, 2017 14:02
PHP functions based on mbstring and preg_match that detect UTF-8 and, if none is detected, convert the string from the usual German encoding (ISO-8859-15) to UTF-8.
<?php
// Damn encoding!!!
//via: http://php.net/manual/de/function.mb-detect-encoding.php#68607
/**
 * Returns 1 if $string contains at least one well-formed multi-byte UTF-8
 * sequence, 0 if it contains none (pure ASCII included), and false if
 * preg_match() fails.
 */
function detectUTF8($string) {
    return preg_match('%(?:
        [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        |\xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
        |[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2} # straight 3-byte
        |\xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
        |\xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
        |[\xF1-\xF3][\x80-\xBF]{3}         # planes 4-15
        |\xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
    )+%xs', $string);
}
function fixEncDe($string) {
    if (detectUTF8($string)) {
        // Nothing to do if $string already contains UTF-8 sequences.
        return $string;
    }
    // No UTF-8 sequences detected: assume the usual ISO-8859-15 and convert,
    // dropping any invalid characters along the way.
    ini_set('mbstring.substitute_character', 'none');
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-15');
}
?>
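The regex above only checks that the string contains at least one well-formed multi-byte sequence, so input that mixes UTF-8 with stray ISO bytes is treated as UTF-8. A stricter variant could validate the whole string with mbstring's `mb_check_encoding()` instead. The following is a minimal sketch of that swapped-in technique, not the gist's method; the function name `toUtf8De` is hypothetical, and the ISO-8859-15 fallback is the same assumption the gist makes.

```php
<?php
// Hypothetical helper (not part of the gist): convert only when the input
// is not valid UTF-8 as a whole, assuming ISO-8859-15 as the fallback.
function toUtf8De(string $string): string
{
    // mb_check_encoding() validates every byte, so pure ASCII and
    // well-formed UTF-8 both pass through unchanged.
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-15');
}

$latin = "Gr\xFC\xDFe";         // "Grüße" encoded as ISO-8859-15
$utf8  = "Gr\xC3\xBC\xC3\x9Fe"; // "Grüße" encoded as UTF-8

var_dump(toUtf8De($latin) === $utf8); // bool(true): converted
var_dump(toUtf8De($utf8) === $utf8);  // bool(true): left untouched
```

Either approach is a heuristic: a byte sequence such as `\xC3\xBC` is valid in both encodings, so short ISO-8859-15 strings can still be mistaken for UTF-8.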