@language-engineering
Created October 24, 2012 10:51
from nltk import pos_tag
from sussex_nltk import lemmatize_tagged
from nltk.tag import untag
# Example list of words
words = ['The', 'badgers', 'were', 'eating', 'some', 'berries', 'and', 'jam']
# PoS tag the words; the result is a list of (word, tag) pairs
tagged_words = pos_tag(words)
# Lemmatise each (word, tag) pair
lemma_words = [lemmatize_tagged(tagged_word) for tagged_word in tagged_words]
# Remove the PoS tags in order to use the lemmas as features
features = untag(lemma_words)
print(features)