@Kimtaro
Created May 12, 2011 09:33
Ruby 1.9 regex \w Unicode
# Encoding: UTF-8
#
# Problem: \w in regular expressions should match Unicode characters
# when in Unicode mode*, but in Ruby 1.9 it matches only ASCII
# word characters ([a-zA-Z0-9_]).
# Solution: Use the corresponding Unicode properties directly.
#
# *http://www.geocities.co.jp/kosako3/oniguruma/doc/RE.txt
r = %r{ [\p{Letter}\p{Mark}\p{Number}\p{Connector_Punctuation}]+ }x
puts "かな漢字K3".match(r).inspect
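
As a rough comparison, the sketch below runs the same sample string against plain `\w`, the explicit property class above, and two built-in shorthands (`[[:word:]]` and `\p{Word}`) that Ruby's regexp engine also treats as Unicode-aware. The constant names and the `first_match` helper are hypothetical, introduced only for this illustration.

```ruby
# Encoding: UTF-8

# Sample text from the snippet above: hiragana, kanji, an ASCII letter, a digit.
SAMPLE = "かな漢字K3"

# Plain \w: matches only ASCII word characters.
ASCII_WORD = /\w+/

# Explicit class built from Unicode general categories, as in the snippet above.
EXPLICIT_WORD = /[\p{Letter}\p{Mark}\p{Number}\p{Connector_Punctuation}]+/

# Built-in shorthands covering the same general categories.
POSIX_WORD    = /[[:word:]]+/
PROPERTY_WORD = /\p{Word}+/

# Hypothetical helper: returns the first matched substring, or nil if none.
def first_match(pattern, text)
  m = pattern.match(text)
  m && m[0]
end

puts first_match(ASCII_WORD, SAMPLE)     # prints "K3"
puts first_match(EXPLICIT_WORD, SAMPLE)  # prints "かな漢字K3"
puts first_match(POSIX_WORD, SAMPLE)     # prints "かな漢字K3"
puts first_match(PROPERTY_WORD, SAMPLE)  # prints "かな漢字K3"
```

If the shorthands are available in the Ruby version in use, `\p{Word}` or `[[:word:]]` may be a shorter way to express the same intent as the explicit class.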