Using a custom tagger with Python NLTK.
Problem: most examples are buggy, so stick to nltk 3.0.1:
pip install nltk==3.0.1
| When I checked org.neo4j.kernel.impl.TransactionEvents I saw that some methods were private. | |
| [('When', 'WRB'), ('I', 'PRP'), ('checked', 'VBD'), ('org.neo4j.kernel.impl.TransactionEvents', 'CCN'), ('I', 'PRP'), ('saw', 'VBD'), ('that', 'IN'), ('some', 'DT'), ('methods', 'NNS'), ('were', 'VBD'), ('private', 'JJ'), ('.', '.')] | |
| Do you think that `getEvents()` can be some kind of useful when using `bool`? | |
| [('Do', 'NNP'), ('you', 'PRP'), ('think', 'VBP'), ('that', 'IN'), ('`getEvents', 'NNS'), ('(', 'VBP'), (')', ':'), ('`', '``'), ('can', 'MD'), ('be', 'VB'), ('some', 'DT'), ('kind', 'NN'), ('of', 'IN'), ('useful', 'JJ'), ('when', 'WRB'), ('using', 'VBG'), ('`bool`', 'NN'), ('?', '.')] | |
| from nltk.corpus import brown | |
| import nltk.tag, nltk.data | |
| document = ''' | |
| Scores of people were already lying dead or injured inside a crowded Orlando nightclub, | |
| and the police had spent hours trying to connect with the gunman and end the situation without further violence. | |
| But when Omar Mateen threatened to set off explosives, the police decided to act, and pushed their way through a | |
| wall to end the bloody standoff. | |
| ''' | |
| document2 = ''' | |
| When I checked org.neo4j.kernel.impl.TransactionEvents I saw that some methods were private. | |
| Do you think that `getEvents()` can be some kind of useful when using `bool`? | |
| ''' | |
| default_tagger = nltk.data.load(nltk.tag._POS_TAGGER) | |
| patterns = [ | |
|     (r'(.*\..*){2,}', 'CCN')  # tokens containing two or more dots, e.g. dotted class names | |
| ] | |
| regexp_tagger = nltk.RegexpTagger(patterns, backoff=default_tagger) | |
| sentences = nltk.sent_tokenize(document2) | |
| for s in sentences: | |
|     print(s) | |
|     text = nltk.word_tokenize(s) | |
|     print(regexp_tagger.tag(text)) |
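
The pattern in the script can be checked without NLTK. The sketch below assumes that RegexpTagger tests each token the way `re.match` does (anchored at the start of the token only) and uses `None` to stand for "fall through to the backoff tagger". `tag_tokens` and `STRICT_PATTERN` are hypothetical names introduced here for illustration; they are not part of the original script or of NLTK. It shows that the original pattern also catches tokens such as `e.g.` and `...`, while a stricter variant only accepts dotted identifiers with at least three segments.

```python
import re

# Pattern copied from the script above: any token containing two or more dots.
ORIGINAL_PATTERN = r'(.*\..*){2,}'

# Hypothetical stricter variant (an assumption, not from the original script):
# dotted identifiers with at least three segments, such as a fully qualified
# Java class name.
STRICT_PATTERN = r'[A-Za-z_]\w*(\.[A-Za-z_]\w*){2,}$'


def tag_tokens(tokens, pattern, tag='CCN'):
    """Tag tokens matching `pattern` with `tag`; all others get None.

    None stands in for "defer to the backoff tagger".
    """
    compiled = re.compile(pattern)
    return [(tok, tag if compiled.match(tok) else None) for tok in tokens]


tokens = ['checked', 'org.neo4j.kernel.impl.TransactionEvents', 'e.g.', '...', '.']

original_tags = tag_tokens(tokens, ORIGINAL_PATTERN)
strict_tags = tag_tokens(tokens, STRICT_PATTERN)

# Print each token with the tag from the original and the strict pattern.
for (tok, loose), (_, strict) in zip(original_tags, strict_tags):
    print(tok, loose, strict)
```

If the stricter behaviour is wanted, the same pattern string could be placed in the `patterns` list in place of the original one. Note that a tagger pattern cannot repair tokens that `word_tokenize` has already split, such as `getEvents()` in the output above.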