
@Steboss89
Created April 12, 2022 15:22
Second approach: use a regex
import re

# this is a test string containing the numbers to be found
test_text = "this is a string with one number and then twenty thousand numbers and three thousand thirty four and three thousand five hundred forty five numbers"
# first, we could think of a simple regex that lists the most frequent (shortest) numbers first
regex = r"\b(one|two|three|four|five|twenty|thirty four|forty five|three thousand|twenty thousand|three thousand thirty four|three thousand five hundred forty five)\b"
re.findall(regex, test_text)
# -> ['one', 'twenty', 'three', 'thirty four', 'three', 'five', 'forty five']
# the result is not what we were expecting
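# reorder the alternatives from "rare" (longest) numbers to "frequent" (shortest) ones:
# alternation picks the first alternative that matches, so longer phrases must come first
regex = r"\b(three thousand five hundred forty five|three thousand thirty four|twenty thousand|three thousand|forty five|thirty four|twenty|five|four|three|two|one)\b"
re.findall(regex, test_text)
# -> ['one', 'twenty thousand', 'three thousand thirty four', 'three thousand five hundred forty five']
# better result: we got what we expected. Can we do even better?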