@morrisalp
Last active February 12, 2020 11:50
spaCy English model with sentence segmentation on newlines
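import spacy

# 'en' is a spaCy 2.x shortcut link to an installed English model
nlp = spacy.load('en')

def set_custom_boundaries(doc):
    # Mark the token that follows each newline token as a sentence start
    for token in doc[:-1]:
        if token.text == "\n":
            doc[token.i + 1].is_sent_start = True
    return doc

# Run before the parser so that it respects the preset boundaries
nlp.add_pipe(set_custom_boundaries, before="parser")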
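The snippet above targets the spaCy 2.x API, where `add_pipe` accepts a function and `'en'` is a shortcut link. In spaCy 3.x, components are registered under a string name and added by that name. A minimal sketch of the same idea under that assumption follows; the component name `newline_sentence_boundaries` and the sample text are made up for illustration, and a blank English pipeline stands in for the full model so that no model download is needed.

```python
import spacy
from spacy.language import Language


# Assumption: spaCy 3.x. The component name is hypothetical.
@Language.component("newline_sentence_boundaries")
def set_custom_boundaries(doc):
    # Same rule as the gist: the token after a "\n" token starts a sentence
    for token in doc[:-1]:
        if token.text == "\n":
            doc[token.i + 1].is_sent_start = True
    return doc


# A blank pipeline has only a tokenizer, so the custom component
# is the sole source of sentence boundaries here.
nlp = spacy.blank("en")
nlp.add_pipe("newline_sentence_boundaries")

doc = nlp("First line\nSecond line")
sentences = [sent.text for sent in doc.sents]
for sentence in sentences:
    print(repr(sentence))
```

With a full pipeline that includes a parser, the equivalent call would likely be `nlp.add_pipe("newline_sentence_boundaries", before="parser")`. Note that the check is an exact match on `"\n"`: spaCy generally groups a run of consecutive newlines into a single whitespace token such as `"\n\n"`, which this rule would not catch, so testing `"\n" in token.text` may be a safer variant if blank lines separate the text.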