@fsndzomga
Last active September 7, 2023 18:10
stemming and lemmatisation
# using nltk
from nltk.stem import PorterStemmer
stemmer = PorterStemmer()
print(stemmer.stem("running")) # Output: 'run'
print(stemmer.stem("flies")) # Output: 'fli'

# using spaCy (install the model first: python -m spacy download en_core_web_sm)
import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("flies running ran")
lemmas = [token.lemma_ for token in doc]
print(lemmas)
# Typical output: ['fly', 'run', 'run'] (lemmas depend on the POS tags the model assigns)
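
The two snippets above give different results for "flies" ('fli' vs 'fly') because a stemmer strips suffixes by rule, while a lemmatizer maps a word to its dictionary form. The following is a minimal, standard-library-only sketch of that difference. It is a toy illustration, not the Porter algorithm and not spaCy's lemmatizer; `SUFFIX_RULES`, `LEMMA_TABLE`, `toy_stem`, and `toy_lemma` are hypothetical names introduced here.

```python
# Toy illustration only: NOT the Porter algorithm and NOT spaCy's lemmatizer.

# Hypothetical suffix rules, tried in order: (suffix, replacement).
SUFFIX_RULES = [("ies", "i"), ("ing", ""), ("s", "")]

# Hypothetical lemma dictionary. Real lemmatizers rely on much larger
# lexicons and usually on part-of-speech information as well.
LEMMA_TABLE = {"flies": "fly", "running": "run", "ran": "run"}


def toy_stem(word):
    """Strip the first matching suffix; the result need not be a real word."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            stem = word[: -len(suffix)] + replacement
            # Collapse a doubled final consonant: "runn" -> "run".
            if len(stem) > 2 and stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return word


def toy_lemma(word):
    """Look the word up in the dictionary; unknown words are returned unchanged."""
    return LEMMA_TABLE.get(word, word)


for w in ["flies", "running", "ran"]:
    print(w, toy_stem(w), toy_lemma(w))
# flies fli fly
# running run run
# ran ran run
```

The irregular form "ran" shows the trade-off: suffix rules leave it untouched, while the lookup recovers "run", at the cost of needing a lexicon.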