@mrdrozdov
Created November 6, 2018 04:54
punct.txt
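# Punctuation tokens, including Penn Treebank bracket escapes (-LRB-, -RRB-, -LCB-, -RCB-).
punctuation_words = {'.', ',', ':', '-LRB-', '-RRB-', "''", '``', '--', ';', '-', '?', '!', '...', '-LCB-', '-RCB-'}
# Currency symbols.
currency_tags_words = {'#', '$', 'C$', 'A$'}
# Penn Treebank null elements and trace markers.
ellipsis = {'*', '*?*', '0', '*T*', '*ICH*', '*U*', '*RNR*', '*EXP*', '*PPA*', '*NOT*'}
# Remaining symbols not covered by the sets above.
other = {'HK$', '&', '**'}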
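
The gist does not show how these sets are meant to be used. One plausible use is dropping such tokens from a tokenized sentence before evaluation, and the sketch below assumes that. The names `IGNORED_TOKENS` and `strip_ignored` and the sample sentence are hypothetical and not part of the original file.

```python
# The four sets, copied from punct.txt.
punctuation_words = {'.', ',', ':', '-LRB-', '-RRB-', "''", '``', '--', ';',
                     '-', '?', '!', '...', '-LCB-', '-RCB-'}
currency_tags_words = {'#', '$', 'C$', 'A$'}
ellipsis = {'*', '*?*', '0', '*T*', '*ICH*', '*U*', '*RNR*', '*EXP*',
            '*PPA*', '*NOT*'}
other = {'HK$', '&', '**'}

# Hypothetical: the union of all four sets (an assumption, not from the gist).
IGNORED_TOKENS = punctuation_words | currency_tags_words | ellipsis | other


def strip_ignored(tokens):
    """Return the tokens that are not in IGNORED_TOKENS, preserving order."""
    return [tok for tok in tokens if tok not in IGNORED_TOKENS]


# Made-up example sentence in Penn Treebank style tokenization.
sentence = ['The', 'price', '-LRB-', '$', '5', '-RRB-', 'rose', '*T*', '.']
print(strip_ignored(sentence))  # ['The', 'price', '5', 'rose']
```

Matching is by exact string, so `'0'` is removed as a null element while `'5'` is kept. Whether a literal zero should be dropped depends on the corpus, so treat the union as a starting point.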