@shikendon
Created April 24, 2016 04:54
Escape Unicode control characters other than \t, \r, and \n into the \uxxxx form
class String
  # Escape Unicode control characters except \t \r \n.
  # \p{C} is the Unicode "Other" general category, which includes the control characters.
  # Each matched code point is replaced by a literal \uxxxx sequence (lowercase hex, at least four digits).
  def escape_unicode_control_characters
    gsub(/[\p{C}&&[^\t\r\n]]+/u) do |match|
      match.unpack('U*').map { |i| '\u' + i.to_s(16).rjust(4, '0') }.join
    end
  end
end
require 'json'
bad_json = "{\"text\":\"String contain unicode \u00008 characters\"}"
JSON.parse(bad_json) # JSON::ParserError: 784: unexpected token at '{"text":"String contain unicode ' ...
JSON.parse(bad_json.escape_unicode_control_characters) # {"text"=>"String contain unicode \u00008 characters"}
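
One case the method above does not cover: for a code point above U+FFFF, `i.to_s(16)` yields five or more hex digits (for example `\ue0001`), which a JSON parser reads as `\ue000` followed by a literal `1`. The following is a minimal sketch of a variant that emits a UTF-16 surrogate pair for those code points. The helper name `escape_control_chars_for_json` is hypothetical and not part of the original gist, and it is written as a plain method rather than a `String` extension purely for illustration.

```ruby
require 'json'

# Hypothetical helper (not from the original gist): escapes every \p{C}
# character except \t, \r, \n. Code points above U+FFFF are written as a
# UTF-16 surrogate pair so the result stays valid inside a JSON string.
def escape_control_chars_for_json(str)
  str.gsub(/[\p{C}&&[^\t\r\n]]/u) do |char|
    code = char.ord
    if code > 0xFFFF
      offset = code - 0x10000
      high = 0xD800 + (offset >> 10)    # high (lead) surrogate
      low  = 0xDC00 + (offset & 0x3FF)  # low (trail) surrogate
      format('\u%04x\u%04x', high, low)
    else
      format('\u%04x', code)
    end
  end
end

# U+E0001 (LANGUAGE TAG) is a format character in \p{C} outside the BMP.
raw = "{\"text\":\"tag \u{E0001} and nul \u0000 here\"}"
escaped = escape_control_chars_for_json(raw)
p JSON.parse(escaped)
```

Matching one character at a time keeps the branch on the code point simple; the trade-off is one block call per character instead of one per run of characters.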