Created September 6, 2012 09:05
UTF8 encode array/object structure in PHP
<?php
function utf8_encode_deep(&$input) {
    if (is_string($input)) {
        $input = utf8_encode($input);
    } else if (is_array($input)) {
        foreach ($input as &$value) {
            utf8_encode_deep($value);
        }
        unset($value);
    } else if (is_object($input)) {
        $vars = array_keys(get_object_vars($input));
        foreach ($vars as $var) {
            utf8_encode_deep($input->$var);
        }
    }
}
?>
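A short usage sketch, assuming every string in the structure is ISO-8859-1 (Latin-1), which is the only input encoding utf8_encode() handles. Since utf8_encode() is deprecated as of PHP 8.2, the sketch swaps in mb_convert_encoding() (this needs the mbstring extension). The name latin1_to_utf8_deep() and the sample data are made up for illustration and are not part of the gist.

```php
<?php
// Hypothetical variant of utf8_encode_deep() for newer PHP versions.
// Assumption: all strings are ISO-8859-1 and mbstring is installed.
function latin1_to_utf8_deep(&$input) {
    if (is_string($input)) {
        $input = mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
    } elseif (is_array($input)) {
        foreach ($input as &$value) {
            latin1_to_utf8_deep($value);
        }
        unset($value); // break the reference left over from the loop
    } elseif (is_object($input)) {
        // get_object_vars() only sees properties accessible from this scope.
        foreach (array_keys(get_object_vars($input)) as $var) {
            latin1_to_utf8_deep($input->$var);
        }
    }
}

// Sample data with Latin-1 bytes: \xF6 is "ö", \xE9 is "é".
$data = array(
    'name' => "J\xF6rg",
    'tags' => array("caf\xE9", 42),
);
$data['owner'] = new stdClass();
$data['owner']->city = "K\xF6ln";

// The structure is modified in place; non-string values are left alone.
latin1_to_utf8_deep($data);
```

As in the gist, array keys and property names are not converted, only the values.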
Hi there, I have a big problem with the XML returned by simplexml_load_string(): it only works with UTF-8, and this code does not work for SimpleXMLElement objects. Any ideas?
I have optimized the code a bit, so this is where I am now:
<?php
final class Tools
{
    /**
     * UTF-8 decodes or encodes an entire object/array structure.
     * WARNING: Only the values are de-/encoded, not the keys!
     * @version 17.07.2015 NS: Created
     * @version 17.02.2016 NS: html_entity_decode/preg_replace, $b_entity_replace inserted!
     *   -> Characters undefined in ISO-8859-1 are now replaced by their entities when decoding UTF-8, and vice versa.
     * @version 01.03.2016 NS: WARNING: This function does not work for SimpleXMLElement objects.
     *
     * @param mixed $input The input (array/object/string mix)
     * @param bool $b_encode TRUE to encode, FALSE to decode.
     * @param bool $b_entity_replace Whether it is OK to replace entities.
     *   -> There is hardly any reason to set this to FALSE unless it does not work or takes too much time; no errors found yet.
     *
     * @return mixed The de-/encoded object/array/string value.
     */
    static function utf8_code_deep($input, $b_encode = TRUE, $b_entity_replace = TRUE)
    {
        if (is_string($input))
        {
            if($b_encode)
            {
                $input = utf8_encode($input);
                //Turn entities back into UTF-8 characters.
                //Important for interfaces to black-box pages, so that real UTF-8 characters are sent and not entities.
                if($b_entity_replace)
                {
                    $input = html_entity_decode($input, ENT_NOQUOTES/* | ENT_HTML5*/, 'UTF-8'); //ENT_HTML5 requires PHP 5.4.
                }
            }
            else
            {
                //Replace non-ISO characters with their entities so they do not end up as '?' characters.
                if($b_entity_replace)
                {
                    //preg_replace_callback instead of the /e modifier, which was removed in PHP 7.
                    $input = preg_replace_callback("/([\304-\337])([\200-\277])/", function ($m) {
                        return '&#'.((ord($m[1])-192)*64+(ord($m[2])-128)).';';
                    }, $input);
                }
                $input = utf8_decode($input);
            }
        }
        elseif (is_array($input))
        {
            foreach ($input as &$value)
            {
                $value = self::utf8_code_deep($value, $b_encode, $b_entity_replace);
            }
            unset($value); //break the reference left over from the loop
        }
        elseif (is_object($input))
        {
            $vars = array_keys(get_object_vars($input));
            if(get_class($input) == 'SimpleXMLElement')
            {
                //DOES NOT WORK!
                return '';
            }
            foreach ($vars as $var)
            {
                $input->$var = self::utf8_code_deep($input->$var, $b_encode, $b_entity_replace);
            }
        }
        return $input;
    }
}
?>
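To make the decode branch's regular expression easier to follow: it matches a two-byte UTF-8 sequence whose lead byte is in the range 0xC4 to 0xDF (code points U+0100 to U+07FF, none of which exist in ISO-8859-1) and rebuilds the code point from the two bytes. A minimal sketch of just that step, written with preg_replace_callback(); the function name utf8_2byte_to_entities() is made up for this example.

```php
<?php
// Hypothetical helper isolating the entity-replacement step of the decode branch.
function utf8_2byte_to_entities($s) {
    return preg_replace_callback(
        "/([\xC4-\xDF])([\x80-\xBF])/", // lead byte, then one continuation byte
        function ($m) {
            // A lead byte 110xxxxx contributes its low 5 bits (ord - 192),
            // a continuation byte 10xxxxxx its low 6 bits (ord - 128).
            $code_point = (ord($m[1]) - 192) * 64 + (ord($m[2]) - 128);
            return '&#' . $code_point . ';';
        },
        $s
    );
}

// U+0100 is encoded as the bytes C4 80: (196 - 192) * 64 + (128 - 128) = 256.
echo utf8_2byte_to_entities("\xC4\x80"), "\n"; // &#256;
```

Characters such as "ä" (bytes C3 A4) fall below the matched range, so they pass through unchanged and are left for utf8_decode() to convert. Three- and four-byte sequences are not matched either, so those characters would still end up as '?' after decoding.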
charliexyx
<?php
final class Tools
{
    static function utf8_code_deep($input, $b_encode = TRUE, $b_entity_replace = TRUE)
    {
        if (is_string($input))
        {
            if($b_encode)
            {
                $input = utf8_encode($input);
                //Turn entities back into UTF-8 characters.
                //Important for interfaces to black-box pages, so that real UTF-8 characters are sent and not entities.
                if($b_entity_replace)
                {
                    $input = html_entity_decode($input, ENT_NOQUOTES/* | ENT_HTML5*/, 'UTF-8'); //ENT_HTML5 requires PHP 5.4.
                }
            }
            else
            {
                //Replace non-ISO characters with their entities so they do not end up as '?' characters.
                if($b_entity_replace)
                {
                    //preg_replace_callback instead of the /e modifier, which was removed in PHP 7.
                    $input = preg_replace_callback("/([\304-\337])([\200-\277])/", function ($m) {
                        return '&#'.((ord($m[1])-192)*64+(ord($m[2])-128)).';';
                    }, $input);
                }
                $input = utf8_decode($input);
            }
            return $input;
        }
        elseif (is_array($input))
        {
            foreach ($input as &$value)
            {
                $value = self::utf8_code_deep($value, $b_encode, $b_entity_replace);
            }
            unset($value); //break the reference left over from the loop
            return $input;
        }
        elseif (is_object($input))
        {
            foreach ($input as $k=>$val)
            {
                //Recurse into the value itself ($val), not into $input->$val.
                $input->$k = self::utf8_code_deep($val, $b_encode, $b_entity_replace);
            }
            return $input;
        }
        //Other types (int, float, bool, null) are returned unchanged.
        return $input;
    }
}
?>
Thanks for the idea, tipochka, but it still does not work. Here is an example of non-working code, since I have no idea how to change the different elements. In the following example, on the line "foreach ($input as $k=>$val)", $k is 'bar' twice, which causes errors. And foreach by reference is not possible here (fatal error).
$xml_string = "<?xml version='1.0'?><foo><bar><bar_string><![CDATA[example1ÄÖÜ]]></bar_string></bar><bar><bar_string><![CDATA[example2ÄÖÜ]]></bar_string></bar></foo>";
//Must be UTF-8 to work with this function.
$xml = simplexml_load_string($xml_string);
//Now I cannot decode.
$xml_utf8_decoded = Tools::utf8_code_deep($xml, FALSE);
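One possible workaround, sketched here under assumptions and not confirmed anywhere in this thread: copy the SimpleXMLElement tree into plain nested arrays first, collecting repeated child names into lists so that the two 'bar' keys no longer collide, and only then convert the strings. The helper name simplexml_to_array() is made up, attributes are ignored, and the mbstring extension is assumed for the conversion step.

```php
<?php
// Hypothetical helper: copy a SimpleXMLElement tree into arrays and strings.
// Every child name maps to a list, so repeated names such as <bar> are kept.
// Attributes and mixed content are ignored in this sketch.
function simplexml_to_array(SimpleXMLElement $node) {
    if ($node->count() === 0) {
        return (string) $node; // leaf element: return its text content
    }
    $result = array();
    foreach ($node->children() as $name => $child) {
        $result[$name][] = simplexml_to_array($child);
    }
    return $result;
}

// The same document as above, with "ÄÖÜ" written out as UTF-8 bytes.
$xml_string = "<?xml version='1.0'?><foo>"
    . "<bar><bar_string><![CDATA[example1\xC3\x84\xC3\x96\xC3\x9C]]></bar_string></bar>"
    . "<bar><bar_string><![CDATA[example2\xC3\x84\xC3\x96\xC3\x9C]]></bar_string></bar>"
    . "</foo>";

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$xml = simplexml_load_string($xml_string, 'SimpleXMLElement', LIBXML_NOCDATA);
$data = simplexml_to_array($xml);

// Convert every leaf string from UTF-8 to ISO-8859-1 in place.
array_walk_recursive($data, function (&$value) {
    $value = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
});
```

After this, $data['bar'] holds two entries, one per bar element. Instead of the array_walk_recursive() step, the array could presumably also be passed to Tools::utf8_code_deep($data, FALSE), since arrays and strings are the cases that function already handles.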
thanks :)
Great Man!! Thanks!