
@DerekChia
Last active December 1, 2018 05:17
w2v_generate_training_data
text = "natural language processing and machine learning is fun and exciting"
# Lowercase every word with .lower(), since upper- and lowercase are treated as equivalent in our implementation.
# Resulting corpus (a list containing one tokenized sentence):
# [['natural', 'language', 'processing', 'and', 'machine', 'learning', 'is', 'fun', 'and', 'exciting']]
corpus = [[word.lower() for word in text.split()]]
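
As a minimal sketch of what might follow this tokenization step, the snippet below counts the words in `corpus` and builds lookup tables between words and integer indices. The names `word_counts`, `vocab`, `word_index`, and `index_word` are illustrative assumptions, not taken from the gist, and sorting the vocabulary alphabetically is just one arbitrary way to get a stable ordering.

```python
from collections import Counter

text = "natural language processing and machine learning is fun and exciting"
corpus = [[word.lower() for word in text.split()]]

# Count how often each word occurs across all sentences in the corpus.
word_counts = Counter(word for sentence in corpus for word in sentence)

# Hypothetical lookup tables (names are illustrative, not from the gist).
# Sorting gives a deterministic word order; any stable order would work.
vocab = sorted(word_counts)
word_index = {word: i for i, word in enumerate(vocab)}
index_word = {i: word for word, i in word_index.items()}

print(len(vocab))          # number of unique words
print(word_counts["and"])  # occurrences of "and"
```

Because "and" appears twice in the sentence, the vocabulary is smaller than the token count; lowercasing beforehand keeps variants such as "And" and "and" from being counted as separate words.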