@icoxfog417
Last active February 22, 2019 08:20
chariot_demo
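import chariot.transformer as ct
from chariot.preprocessor import Preprocessor

# Build the pipeline: each stack() call appends one step, and steps run in the order stacked.
# train_data is not defined in this snippet; it is assumed to be the training corpus loaded elsewhere.
preprocessor = Preprocessor()
preprocessor\
    .stack(ct.text.UnicodeNormalizer())\
    .stack(ct.Tokenizer("en"))\
    .stack(ct.token.StopwordFilter("en"))\
    .stack(ct.Vocabulary(min_df=5, max_df=0.5))\
    .fit(train_data)

# Persist the fitted preprocessor, then restore it from the same file.
preprocessor.save("my_preprocessor.pkl")
loaded = Preprocessor.load("my_preprocessor.pkl")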
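
The demo above depends on chariot and on a `train_data` corpus that the snippet does not define. To try the same stack / fit / save / load pattern without either, here is a minimal stdlib-only sketch. It is not chariot's implementation: `MiniPipeline`, `Lowercase`, `WhitespaceTokenizer`, and `MinCountVocabulary` are hypothetical stand-ins written for illustration, and the three-sentence corpus is made up.

```python
import os
import pickle
import tempfile
from collections import Counter


class Lowercase:
    """Hypothetical stateless step: lowercases each document."""

    def fit(self, docs):
        return self

    def transform(self, docs):
        return [d.lower() for d in docs]


class WhitespaceTokenizer:
    """Hypothetical stateless step: splits each document on whitespace."""

    def fit(self, docs):
        return self

    def transform(self, docs):
        return [d.split() for d in docs]


class MinCountVocabulary:
    """Hypothetical stateful step: assigns ids to tokens seen at least min_count times."""

    def __init__(self, min_count=1):
        self.min_count = min_count
        self.index = {}

    def fit(self, docs):
        counts = Counter(tok for doc in docs for tok in doc)
        kept = sorted(t for t, c in counts.items() if c >= self.min_count)
        # Id 0 is reserved for tokens outside the vocabulary.
        self.index = {tok: i + 1 for i, tok in enumerate(kept)}
        return self

    def transform(self, docs):
        return [[self.index.get(tok, 0) for tok in doc] for doc in docs]


class MiniPipeline:
    """Hypothetical stand-in for a stackable, picklable preprocessor."""

    def __init__(self):
        self.steps = []

    def stack(self, step):
        self.steps.append(step)
        return self  # returning self is what allows chained .stack() calls

    def fit(self, docs):
        # Fit each step on the output of the previous one.
        for step in self.steps:
            docs = step.fit(docs).transform(docs)
        return self

    def transform(self, docs):
        for step in self.steps:
            docs = step.transform(docs)
        return docs

    def save(self, path):
        with open(path, "wb") as f:
            pickle.dump(self, f)

    @classmethod
    def load(cls, path):
        with open(path, "rb") as f:
            return pickle.load(f)


docs = ["The cat sat", "the cat ran", "A dog ran"]  # made-up corpus
pipeline = MiniPipeline()
pipeline\
    .stack(Lowercase())\
    .stack(WhitespaceTokenizer())\
    .stack(MinCountVocabulary(min_count=2))\
    .fit(docs)

path = os.path.join(tempfile.mkdtemp(), "mini_pipeline.pkl")
pipeline.save(path)
restored = MiniPipeline.load(path)
encoded = restored.transform(["the dog sat"])
print(encoded)
```

The restored pipeline carries the fitted vocabulary with it, so new text can be encoded without refitting. Tokens that fell below `min_count` during fitting map to the reserved id 0.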