@bkazez
Last active July 23, 2023 22:02
Accent Removal Benchmark 2
require 'benchmark'
require 'i18n'

I18n.config.available_locales = :en

# Combining marks from three Unicode blocks: Combining Diacritical Marks
# Supplement, Combining Diacritical Marks, and Combining Half Marks.
COMBINING_DIACRITICS = [*0x1DC0..0x1DFF, *0x0300..0x036F, *0xFE20..0xFE2F].pack('U*').freeze

def removeaccents(str)
  str
    .unicode_normalize(:nfd) # Decompose characters
    .tr(COMBINING_DIACRITICS, '') # Delete the combining marks (empty replacement)
    .unicode_normalize(:nfc) # Recompose characters
end

lines = File.readlines(File.expand_path("~/Desktop/1200_strings_average_length_110_chars.txt"))

# bmbm runs a rehearsal pass before the measured pass
Benchmark.bmbm do |x|
  x.report("I18n") { lines.each { |line| I18n.transliterate(line) } }
  x.report("removeaccents") { lines.each { |line| removeaccents(line) } }
end
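
The benchmark only measures speed, so a quick check of what the decompose, strip, recompose approach actually returns may be useful. The following is a minimal sketch, not part of the original gist: `SKETCH_COMBINING_MARKS` and `strip_accents_sketch` are hypothetical names that mirror `COMBINING_DIACRITICS` and `removeaccents` so the block runs on its own without the `i18n` gem or the input file.

```ruby
# Hypothetical stand-alone copy of the gist's approach, for illustration only.
SKETCH_COMBINING_MARKS =
  [*0x1DC0..0x1DFF, *0x0300..0x036F, *0xFE20..0xFE2F].pack('U*').freeze

def strip_accents_sketch(str)
  str
    .unicode_normalize(:nfd)         # "é" becomes "e" + U+0301
    .tr(SKETCH_COMBINING_MARKS, '')  # empty replacement deletes the marks
    .unicode_normalize(:nfc)         # recompose whatever is left
end

puts strip_accents_sketch("caf\u00E9")                        # cafe
puts strip_accents_sketch("Cr\u00E8me br\u00FBl\u00E9e")      # Creme brulee

# Letters with no canonical decomposition pass through unchanged,
# e.g. U+00F8 (o with stroke) stays as it is.
puts strip_accents_sketch("\u00F8") == "\u00F8"               # true
```

Because this approach only removes combining marks, characters such as "ø" or "ß" are left untouched, so its output is not guaranteed to match a transliteration table character for character.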