@fuzzmonkey
Created November 27, 2012 22:46
glitchr_
̥ ̥̥ ̥̥̥ ̥̥̥̥ ̥̥̥̥̥ ̥̥̥̥̥̥̥̊̊̊̊̊̊̊̊̊̊̊ ̥̥̥̥̥̥̥̥̊̊̊̊̊̊̊̊̊̊̊ ̥̥̥̥̥̥̥̥̥̊̊̊̊̊̊̊̊̊̊̊ ̥̥̥̥̥̥̥̥̊̊̊̊̊̊̊̊̊̊̊ ̥̥̥̥̥̥̥̊̊̊̊̊̊̊̊̊̊̊ ̥̥̥̥̥ ̥̥̥̥ ̥̥̥ ̥̥ ̥
If we take a look at the actual characters of the above string:
str.split(//).map(&:ord).to_s
=> [10, 805, 32, 805, 805, 32, 805, 805, 805, 32, 805, 805, 805, 805, 32, 805, 805, 805, 805, 805, 32, 805, 805, 805, 805, 805, 805, 805, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 32, 805, 805, 805, 805, 805, 805, 805, 805, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 32, 805, 805, 805, 805, 805, 805, 805, 805, 805, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 32, 805, 805, 805, 805, 805, 805, 805, 805, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 32, 805, 805, 805, 805, 805, 805, 805, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 778, 32, 805, 805, 805, 805, 805, 32, 805, 805, 805, 805, 32, 805, 805, 805, 32, 805, 805, 32, 805, 10]
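Unicode charts list characters by hex code point rather than decimal, so it can help to convert. A minimal sketch, assuming a made-up helper name (`code_points_hex`) and a short sample string instead of the full one above:

```ruby
# Hypothetical helper (not from the original snippet): format each
# code point of a string in the usual U+XXXX notation.
def code_points_hex(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

sample = [32, 805, 778].pack("U*") # a space, then characters 805 and 778
p code_points_hex(sample)
# => ["U+0020", "U+0325", "U+030A"]
```

So 805 is U+0325 and 778 is U+030A, which are the values to search for in a code chart.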
Looking at the bulk of this string, character 805, this appears to be how the character is meant to be displayed. See http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm, "Non-spacing diacritics and suprasegmentals": it's one of the Unicode characters used for the International Phonetic Alphabet. It is a combining character (U+0325, COMBINING RING BELOW) that attaches to whatever character comes before it, so you can do things like:
[105,805].pack "U*"
=> "i̥"
I think that makes the 'i' 'voiceless', apparently. I guess it's a quirk of being able to display two characters 'on top of each other' that means you can make stuff like that arrow. You can have fun with this in irb:
[805,805,805,805,805,805,805,805].pack("U*")
=> "̥̥̥̥̥̥̥̥"
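The arrow itself can be rebuilt from the code point dump further up: groups of character 805 (plus, in the middle five groups, eleven of character 778), separated by spaces, with a newline at each end. A rough sketch, where `build_glitch` and the variable names are just illustrative labels for that pattern:

```ruby
# Rebuild the glitch string from the group sizes read off the code point dump.
def build_glitch
  ring_below = 805 # U+0325
  ring_above = 778 # U+030A
  space      = 32
  newline    = 10

  # [number of 805s, number of 778s] for each space-separated group
  groups = [[1, 0], [2, 0], [3, 0], [4, 0], [5, 0],
            [7, 11], [8, 11], [9, 11], [8, 11], [7, 11],
            [5, 0], [4, 0], [3, 0], [2, 0], [1, 0]]

  # Put one space before every group, then drop the very first space.
  body = groups.flat_map do |below, above|
    [space] + [ring_below] * below + [ring_above] * above
  end.drop(1)

  ([newline] + body + [newline]).pack("U*")
end

puts build_glitch
```

Changing the numbers in `groups` is an easy way to experiment with other shapes.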
Interestingly, the symbol displayed changes quite substantially depending on which encoding you use:
require 'iconv'
# Treat the UTF-8 bytes of character 805 as each encoding in turn and convert back to UTF-8
Iconv.list.flatten.each do |enc|
  puts "#{enc}: #{Iconv.conv("UTF-8", enc, [805].pack("U"))}" rescue enc
end
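Iconv left the standard library in Ruby 2.0, so on a newer Ruby the same experiment can be approximated with String#force_encoding and String#encode. This is only a sketch: `reinterpret` is a made-up helper name, and it tries a handful of encodings rather than everything Iconv.list returns:

```ruby
# Hypothetical helper: treat the bytes of a UTF-8 string as if they were
# written in encoding_name, then convert the result back to UTF-8.
def reinterpret(utf8_string, encoding_name)
  utf8_string.dup.force_encoding(encoding_name).encode("UTF-8")
rescue EncodingError, ArgumentError => e
  # Unknown encoding name, bytes invalid in that encoding, or no converter.
  "(#{e.class})"
end

ring_below = [805].pack("U") # two bytes in UTF-8: 0xCC 0xA5

%w[ISO-8859-1 Windows-1252 ISO-8859-5 Shift_JIS UTF-16BE].each do |enc|
  puts "#{enc}: #{reinterpret(ring_below, enc)}"
end
```

Read as ISO-8859-1, for example, the two bytes become two separate characters, "Ì" and "¥", instead of one combining ring.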
This is a pretty good read on Unicode stuff: http://www.joelonsoftware.com/articles/Unicode.html.