Ákos Kádár akoskadar

akoskadar / tf_idf.py

Created November 8, 2013 00:03 — forked from vineetrok/tf_idf.py

	import glob
	import math
	line=''
	s=set()
	flist=glob.glob(r'E:\PROGRAMMING\PYTHON\programs\corpus2\*.txt') # get all the .txt files from the directory
	# open each file >> tokenize the content >> and store it in a set
	for fname in flist:
	    tfile=open(fname,"r")
	    line=tfile.read() # read the content of file and store in "line"
	    tfile.close() # close the file
	    s=s.union(set(line.split(' '))) # add this file's distinct tokens to the running set
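
The preview above stops after the vocabulary set is built. As a minimal sketch of how such a vocabulary could feed a TF-IDF weighting, the following assumes the common `tf * log(N / df)` definition with raw counts and an unsmoothed natural-log idf; the truncated gist does not show which variant it uses. The `docs` dictionary and the function names `build_vocabulary` and `tf_idf` are hypothetical, in-memory strings stand in for the corpus files, and `split()` is used instead of `split(' ')` so that newlines and repeated spaces do not produce stray tokens.

```python
import math

def build_vocabulary(docs):
    """Return the set of distinct whitespace-separated tokens across all documents."""
    vocab = set()
    for text in docs.values():
        vocab |= set(text.split())
    return vocab

def tf_idf(docs):
    """Map each document name to {term: tf * log(N / df)}.

    Assumption: tf is the raw count and idf is an unsmoothed natural log.
    """
    tokenized = {name: text.split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = {}
    for tokens in tokenized.values():
        for term in set(tokens):
            df[term] = df.get(term, 0) + 1
    weights = {}
    for name, tokens in tokenized.items():
        weights[name] = {
            term: tokens.count(term) * math.log(n_docs / df[term])
            for term in set(tokens)
        }
    return weights

# Hypothetical two-document corpus held in memory instead of read from disk.
docs = {"a.txt": "the cat sat", "b.txt": "the dog sat down"}
vocab = build_vocabulary(docs)
weights = tf_idf(docs)
```

Under this definition a term that appears in every document, such as `the` here, gets a weight of zero, which is why smoothed idf variants are often preferred in practice.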

akoskadar / gist:7359046

Created November 7, 2013 18:01 — forked from joshuaboy7/WordCount

	import string

	import csv

	"""
	fil = open('C:\\Python27\\README.txt')
	new_file = open('C:\\Python27\\freq_list.txt', 'w')
	"""

	fil = open("/Users/StefanCelMare/Desktop/PythonReadme.txt")

akoskadar / Document Term Frequency

Created November 7, 2013 17:42

This piece of Python code counts how many times each word occurs in a text file and writes the result to a separate text file.

	import string


	fil = open('C:\\Python27\\elo.txt', "r")
	new_file = open('C:\\Python27\\freq_list.txt', 'w')
	text = fil.read()
	fil.close()


	textClean = ''
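
The preview cuts off before the counting step. The following is a minimal sketch of the task the description names (count word occurrences, then write them to a separate file), not the gist's own code. It assumes Python 3, lowercases the text, and strips ASCII punctuation; the function names `word_frequencies` and `write_frequencies` and the tab-separated output format are hypothetical choices.

```python
import string
from collections import Counter

def word_frequencies(text):
    """Count lowercase words after stripping ASCII punctuation."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return Counter(cleaned.split())

def write_frequencies(counts, path):
    """Write one 'word<TAB>count' line per word, most frequent first."""
    with open(path, "w") as out:
        for word, count in counts.most_common():
            out.write(f"{word}\t{count}\n")

# Hypothetical sample text standing in for the contents of the input file.
counts = word_frequencies("The cat saw the dog. The dog ran!")
```

Calling `write_frequencies(counts, "freq_list.txt")` would then produce the frequency list; the `with` block closes the output file even if a write fails.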