@mumumu
Created February 21, 2013 08:26
JDOM rejects some strings with an error of the form: "....." is not legal for a JDOM character content: 0x... is not a legal XML character. As a workaround, I used the following method, taken from http://blog.mark-mclaren.info/2007/02/invalid-xml-characters-when-valid-utf8_5873.html
/**
* This method ensures that the output String has only
* valid XML unicode characters as specified by the
* XML 1.0 standard. For reference, please see
* <a href="http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char">the
* standard</a>. This method will return an empty
* String if the input is null or empty.
*
* @param in The String whose non-valid characters we want to remove.
* @return The in String, stripped of non-valid characters.
*/
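public String skipInValidXMLChars(String in) {
    if (in == null || in.isEmpty()) return "";
    StringBuilder out = new StringBuilder(in.length());
    // Iterate by code point rather than by char: a char never exceeds 0xFFFF,
    // so a per-char test against 0x10000..0x10FFFF can never match and valid
    // supplementary characters (surrogate pairs) would be stripped.
    for (int i = 0; i < in.length(); ) {
        int current = in.codePointAt(i);
        i += Character.charCount(current);
        if ((current == 0x9) ||
            (current == 0xA) ||
            (current == 0xD) ||
            ((current >= 0x20) && (current <= 0xD7FF)) ||
            ((current >= 0xE000) && (current <= 0xFFFD)) ||
            ((current >= 0x10000) && (current <= 0x10FFFF))) {
            out.appendCodePoint(current);
        }
    }
    return out.toString();
}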
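
To try the filter in isolation, here is a minimal self-contained sketch, assuming Java 8 or later. The class name `XmlCharFilterDemo` and the helpers `isValidXmlChar` and `stripInvalidXmlChars` are hypothetical names chosen for illustration; they are not part of JDOM or of the linked post. It applies the same XML 1.0 `Char` ranges as the method above, but walks the input by code point so that characters outside the Basic Multilingual Plane are kept and unpaired surrogates are dropped.

```java
final class XmlCharFilterDemo {

    // Hypothetical helper: true if the code point is allowed by the XML 1.0 Char production.
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical helper: returns the input without code points that XML 1.0 forbids.
    // Returns an empty string for null or empty input, like the method above.
    static String stripInvalidXmlChars(String in) {
        if (in == null || in.isEmpty()) return "";
        return in.codePoints()                       // surrogate pairs arrive as one code point
                .filter(XmlCharFilterDemo::isValidXmlChar)
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        // NUL and vertical tab are not legal XML 1.0 characters.
        String dirty = "a\u0000b\u000Bc";
        System.out.println(stripInvalidXmlChars(dirty)); // prints "abc"
    }
}
```

The cleaned string can then be passed to whatever JDOM call raised the error. Note that the offending characters are dropped silently, so this suits display text better than data that must round-trip unchanged.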