@justmytwospence
Created February 26, 2014 05:23
Vectorization for text mining. Includes the Porter stemmer, a custom regular expression tokenizer, and the sklearn term frequency-inverse document frequency (TF-IDF) vectorizer.
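
A minimal sketch of how these three pieces could fit together, assuming NLTK's `PorterStemmer` and scikit-learn's `TfidfVectorizer`. The token pattern, the `tokenize` helper, and the sample documents are illustrative assumptions, not the gist's own code.

```python
import re

from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical token pattern (an assumption): runs of two or more ASCII letters.
TOKEN_PATTERN = re.compile(r"[a-z]{2,}")
stemmer = PorterStemmer()


def tokenize(text):
    """Lowercase the text, extract regex tokens, and Porter-stem each token."""
    return [stemmer.stem(token) for token in TOKEN_PATTERN.findall(text.lower())]


# The custom tokenizer replaces sklearn's default token pattern, so
# token_pattern is set to None; lowercasing already happens in tokenize().
vectorizer = TfidfVectorizer(tokenizer=tokenize, lowercase=False, token_pattern=None)

# Illustrative toy corpus (an assumption, not data from the gist).
documents = [
    "The cats are running quickly.",
    "A cat runs and a dog barks.",
]

# Sparse matrix with one row per document and one column per stemmed term.
tfidf_matrix = vectorizer.fit_transform(documents)
vocabulary = vectorizer.vocabulary_  # maps each stemmed term to its column index
```

Stemming inside the tokenizer means inflected forms such as "cats" and "cat" share one column, which shrinks the vocabulary before the TF-IDF weights are computed.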