@rcalsaverini
Created August 30, 2014 15:05
Removing accents from Unicode strings in Python
import unicodedata


def strip_accents(unicode_string):
    """
    Strip accents (nonspacing combining marks, category 'Mn') from a unicode string.
    """
    # NFD decomposition splits each accented character into base + combining marks.
    nfd_string = unicodedata.normalize('NFD', unicode_string)
    # Keep every character that is not a nonspacing mark.
    return ''.join(
        char for char in nfd_string if unicodedata.category(char) != 'Mn'
    )
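
A short usage sketch, assuming Python 3 where `str` is already Unicode. The function is repeated so the snippet runs on its own; the sample strings are illustrative and not part of the original gist. It also shows one limitation: letters such as `ø` have no canonical decomposition, so NFD leaves them as a single code point and they pass through unchanged.

```python
import unicodedata


def strip_accents(unicode_string):
    """Drop nonspacing combining marks (category 'Mn') after NFD decomposition."""
    nfd_string = unicodedata.normalize('NFD', unicode_string)
    return ''.join(
        char for char in nfd_string if unicodedata.category(char) != 'Mn'
    )


# Accented letters decompose into base letter + combining mark; the mark is dropped.
print(strip_accents("São Paulo"))
print(strip_accents("café"))

# 'ø' is a distinct letter with no decomposition, so it is left as-is.
print(strip_accents("Søren"))
```

If letters like `ø` or `ł` also need an ASCII form, an explicit replacement table applied after this function is one option.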