@Eligijus112
Created March 4, 2020 05:27
# Defining the window for context: how many words on each side count as context
window = 2

# Placeholder for the [focus word, context word] pairs
word_lists = []
# Placeholder for every token seen across all texts
all_text = []

# `texts` and `text_preprocessing` are expected to be defined before this snippet runs
for text in texts:
    # Cleaning the text; the result is treated as a list of tokens
    text = text_preprocessing(text)

    # Appending the tokens to the all text list
    all_text += text

    # Creating the context pairs for each word
    for i, word in enumerate(text):
        for w in range(window):
            # Getting the context word that is w + 1 positions ahead
            if i + 1 + w < len(text):
                word_lists.append([word, text[i + 1 + w]])
            # Getting the context word that is w + 1 positions behind
            if i - w - 1 >= 0:
                word_lists.append([word, text[i - w - 1]])
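To try the windowing logic on its own, the loop can be wrapped in a function and fed a small token list. This is only a sketch: `simple_preprocessing` is a hypothetical stand-in for `text_preprocessing` (which is not shown in this gist) and simply lowercases the text and keeps alphabetic tokens, and `context_pairs` is an illustrative name, not part of the original code.

```python
import re


def simple_preprocessing(text):
    """Hypothetical stand-in for text_preprocessing: lowercase, keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def context_pairs(tokens, window=2):
    """Return [focus word, context word] pairs for up to `window` words on each side."""
    pairs = []
    for i, word in enumerate(tokens):
        for w in range(window):
            # Context word w + 1 positions ahead, if it exists
            if i + 1 + w < len(tokens):
                pairs.append([word, tokens[i + 1 + w]])
            # Context word w + 1 positions behind, if it exists
            if i - w - 1 >= 0:
                pairs.append([word, tokens[i - w - 1]])
    return pairs


tokens = simple_preprocessing("A b c")
print(context_pairs(tokens, window=1))
# [['a', 'b'], ['b', 'c'], ['b', 'a'], ['c', 'b']]
```

Words near the start or end of a text simply produce fewer pairs, because the bounds checks skip positions that fall outside the token list.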