@sirovenmitts
Last active December 18, 2015 08:09
Guess the number of syllables in a word. Not as good as looking up stuff from the CMU pronunciation dictionary...
"This is a naive implementation of syllable guessing. This is for Amber smalltalk. It probably works elsewhere with minor changes."
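| word |
word := 'tumult' asLowercase.
"Drop a trailing 'es', 'ed' or 'e' that is usually silent. The 'l' in the character class leaves endings such as '-le' alone."
word := word replaceRegexp: '(?:[^laeiouy]es|ed|[^laeiouy]e)$' with: ''.
"A leading 'y' acts as a consonant, so remove it before counting vowels."
word := word replaceRegexp: '^y' with: ''.
"Count every run of one or two vowels as one syllable and log the total."
console log: ( word matchesOf: ( RegularExpression fromString: '[aeiouy]{1,2}' flag: 'g' ) ) size.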
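For readers without an Amber environment, the same heuristic can be sketched in Python. This is a port, not part of the original gist: the function name `guess_syllables` and the constant names are made up for illustration, while the three regular expressions are copied unchanged from the Smalltalk above.

```python
import re

# The patterns are taken verbatim from the Smalltalk snippet.
# The names below are hypothetical and exist only for this sketch.
SILENT_ENDING = re.compile(r"(?:[^laeiouy]es|ed|[^laeiouy]e)$")
LEADING_Y = re.compile(r"^y")
VOWEL_RUN = re.compile(r"[aeiouy]{1,2}")


def guess_syllables(word: str) -> int:
    """Guess the syllable count of a word by counting vowel runs."""
    word = word.lower()
    # Drop a trailing 'es', 'ed' or consonant + 'e', which is usually silent.
    word = SILENT_ENDING.sub("", word)
    # A leading 'y' behaves like a consonant.
    word = LEADING_Y.sub("", word)
    # Each run of one or two vowels counts as one syllable.
    return len(VOWEL_RUN.findall(word))


print(guess_syllables("tumult"))  # 2
print(guess_syllables("make"))    # 1
print(guess_syllables("table"))   # 2
print(guess_syllables("bed"))     # 0, because the 'ed' rule strips the only vowel
```

The last call shows why the heuristic is only a guess: the unconditional `ed` rule removes the single vowel of a short word such as "bed".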
"Use the CMU Pronunciation Dictionary to determine the number and stress of syllables in a word. Obviously this only works with words that are in the dictionary."
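"Each vowel phoneme ends in a stress digit (0 = none, 1 = primary, 2 = secondary), so stripping every non-digit leaves one digit per syllable."
( 'S IY1 . EH1 M . Y UW1 . D IH1 K SH AH0 N EH2 R IY0' replaceRegexp: ( RegularExpression fromString: '[^\d]' flag: 'g' ) with: '' ) tokenize: ''.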
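The dictionary-based approach can be sketched in Python in the same way. Again this is a port rather than the gist's own code: `stress_pattern` is a hypothetical name, and the pronunciation string is the one used in the Smalltalk line above.

```python
import re


def stress_pattern(pronunciation: str) -> list[str]:
    """Return one stress digit per syllable from a CMU-style pronunciation.

    Hypothetical helper for illustration. It removes every non-digit
    character, then splits the remaining digits into a list.
    """
    digits_only = re.sub(r"[^\d]", "", pronunciation)
    return list(digits_only)


pattern = stress_pattern("S IY1 . EH1 M . Y UW1 . D IH1 K SH AH0 N EH2 R IY0")
print(pattern)       # ['1', '1', '1', '1', '0', '2', '0']
print(len(pattern))  # 7 syllables
```

The length of the list is the syllable count, and each entry is the stress of the corresponding syllable.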
@sirovenmitts
Author

I hate using #tokenize: '' to convert the String to an Array; I should hide that behind some message like #asArray.