@kshirsagarsiddharth
Created July 21, 2020 05:07
unicode decomposition
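from unicodedata import normalize

s1 = 'Jalape\u00f1o'   # composed form: U+00F1 (ñ) is a single code point
s2 = 'Jalapen\u0303o'  # decomposed form: 'n' followed by U+0303 (combining tilde)
print(s1, s2)
# Output: Jalapeño Jalapeño
print(s1 == s2)
# Output: False (same rendering, different code point sequences)

# NFC composes characters where possible, so both strings become the composed form
v1 = normalize('NFC', s1)
v2 = normalize('NFC', s2)
print(v1 == v2)
# Output: True

# NFD decomposes characters, so both strings become the decomposed form
v3 = normalize('NFD', s1)
v4 = normalize('NFD', s2)
print(v3 == v4)
# Output: True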
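
The comparison above can be wrapped in a small helper. The following is only a sketch: `nfc_equal` is a hypothetical name, not part of `unicodedata`, and choosing NFC over NFD for the comparison is an assumption (either form works as long as both sides use the same one). It also prints the string lengths, which show why the raw `==` fails.

```python
from unicodedata import normalize


def nfc_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return normalize('NFC', a) == normalize('NFC', b)


s1 = 'Jalape\u00f1o'   # composed form
s2 = 'Jalapen\u0303o'  # decomposed form

# The decomposed string has one extra code point (the combining tilde).
print(len(s1), len(s2))
# Output: 8 9

# Normalizing changes the length to match the target form.
print(len(normalize('NFC', s2)), len(normalize('NFD', s1)))
# Output: 8 9

print(nfc_equal(s1, s2))
# Output: True
```

Normalizing at the point of comparison keeps the original strings untouched. If many comparisons are needed, normalizing once on input may be simpler.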