Venkat Venkatstatistics

Data scientist/Statistician with business acumen. Hoping to amass knowledge and share it throughout my life.

Venkatstatistics / word2vec_demo

Created September 20, 2019 14:56

	# -*- coding: utf-8 -*-

	from gensim.models.word2vec import Word2Vec
	import gensim.downloader as api

	# Alternatives: these names load pretrained vectors (KeyedVectors), not a raw corpus
	#corpus = api.load('word2vec-google-news-300')
	#corpus = api.load('glove-wiki-gigaword-100')
	#model = api.load('glove-wiki-gigaword-100')
	corpus = api.load('text8') # download the corpus and return it opened as an iterable
	model = Word2Vec(corpus) # train a model from the corpus
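
Once a model like the one above is trained, related words are usually found by comparing word vectors with cosine similarity. The sketch below shows that calculation in plain Python on made-up 3-dimensional vectors. The `toy_vectors` values and the `cosine_similarity` and `most_similar` helpers are illustrative assumptions, not gensim's API and not output from the text8 model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Hypothetical vectors, invented for illustration only.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "apple": [0.1, 0.2, 0.95],
}


def most_similar(word, vectors, topn=2):
    """Rank every other word by cosine similarity to `word`."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


ranked = most_similar("king", toy_vectors)
print(ranked)
```

With these toy values "queen" ranks above "apple" for the query "king", because its vector points in nearly the same direction.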

Venkatstatistics / Text Pre Processing

Created September 20, 2019 14:57

	# -*- coding: utf-8 -*-

	# Lowercasing
	texts = ["JOHN", "keLLY", "ArJUN", "SITA"]
	lower_words = [word.lower() for word in texts]
	print(lower_words)  # a bare expression only displays in a notebook

	# Stemming
	import nltk
	import pandas as pd
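
The preview above stops right after the stemming imports, so the author's stemming code is not shown. As a rough illustration of the idea only, the sketch below lowercases each word and strips a few common suffixes. `toy_stem` and `SUFFIXES` are hypothetical names invented here; this is not the Porter algorithm, and a real stemmer such as NLTK's `PorterStemmer` applies many more rules.

```python
# Suffixes are checked in this order; the first match wins.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Lowercase a word and strip one common suffix (toy rule set)."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["Walking", "walked", "WALKS", "walk"]
stems = [toy_stem(w) for w in words]
print(stems)
```

All four inputs reduce to the same stem, which is the point of stemming: different surface forms are treated as a single token.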

Venkatstatistics / Spacy models to download

Created October 3, 2019 15:45

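	# Written against the spaCy v2 API; in v3, call nlp.add_pipe('sentencizer') instead
	import spacy
	from spacy.lang.en import English
	import en_vectors_web_lg

	# Blank English pipeline with rule-based sentence boundary detection
	nlpsm = English()
	sbd = nlpsm.create_pipe('sentencizer')
	nlpsm.add_pipe(sbd)

	# Large word-vector model, reusing the same sentencizer component
	nlplg = en_vectors_web_lg.load()
	nlplg.add_pipe(sbd)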
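
The `sentencizer` component added above splits text into sentences using punctuation rules rather than a statistical parser. The sketch below imitates that idea with a single regular expression in plain Python. `split_sentences` is a hypothetical helper, not spaCy's implementation, and it will mis-split abbreviations such as "Dr.".

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


text = "spaCy is fast. Does it split sentences? Yes!"
sentences = split_sentences(text)
for sentence in sentences:
    print(sentence)
```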

Venkatstatistics / Spacy Basic Tutorial

Created October 4, 2019 18:04

	###Spacy Tutorials###

	## References: https://course.spacy.io/chapter1 ##

	## References: https://spacy.io/usage/spacy-101 ##

	### Learning to work with NLP object ###

	from spacy.lang.en import English
	nlp = English()