jmcph4 · October 26, 2011 06:50
diff --git a/261011 Class Notes b/261011 Class Notes
 Premise			-		What is accepted to be true (fact).
 Probability		-		The strength of an argument.

 100 people
 	80 healthy
 	20 ill
 	
 Hypothesis Testing
 	
 Text Classification
 	Determine document's category
 	
 	Bag-of-Words Assumption
 		Throw all words into bag.
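A minimal sketch of the bag-of-words idea: the document is reduced to word counts and word order is discarded. The function name `bag_of_words` and the tokenising regular expression are assumptions made for illustration, not something from the lecture.

```python
from collections import Counter
import re


def bag_of_words(text):
    """Return word counts for text, ignoring word order and letter case."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


bag = bag_of_words("FREE cash! Guaranteed cash, free!")
print(bag)
```

Two documents containing the same words in a different order produce the same bag, which is exactly the information the assumption throws away.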
 	
 	A Million Monkeys and a Bag of Spam
 	
 		1	-	FREE!
 		2	-	guaranteed
 		3	-	cash
 		4	-	Administrative
 		
 	Laplace's Rule of Succession
 		
 		1 with word, 0 without word
 		throw: not throw = 2:1
 		1 with word, 1 without word = 2:2
 		1 with word, 2 without = 2:3
 		2 with, 3 without = 3:4
 		20 with, 80 without = 21:81
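The pattern in the examples above is "add one to each count". A short sketch, with function names chosen here for illustration:

```python
def succession_odds(with_word, without_word):
    """Odds (with : without) after adding one to each observed count."""
    return (with_word + 1, without_word + 1)


def succession_probability(with_word, without_word):
    """The same estimate expressed as a probability instead of odds."""
    a, b = succession_odds(with_word, without_word)
    return a / (a + b)


for counts in [(1, 0), (1, 2), (2, 3), (20, 80)]:
    print(counts, succession_odds(*counts))
```

Adding one to each side keeps the estimate away from 0 and 1, so a word seen only once in spam is not treated as certain proof of spam.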
 		
 		spam | ham
 		
 		Python:
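 			from math import exp, log
 			
 			def prob_spam(bag):
 				# likelihood_ratio(word) is assumed to be defined elsewhere
 				# as P(word | spam) / P(word | ham); prior odds assumed 1:1.
 				log_odds = 0.0
 				for word in bag:
 					log_odds += log(likelihood_ratio(word))
 				odds = exp(log_odds)  # the notes' "I_spam / I_ham", read here as spam:ham odds
 				return odds / (1.0 + odds)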
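The function above relies on a `likelihood_ratio` helper that the notes do not define. One possible sketch, assuming the ratio is estimated from per-word message counts with the add-one rule; the counts, the names `SPAM_COUNTS`, `HAM_COUNTS`, `N_SPAM`, `N_HAM` and the `prior_odds` parameter are all hypothetical:

```python
from math import exp, log

# Hypothetical training data: how many spam / ham messages contained each word.
SPAM_COUNTS = {"free": 20, "cash": 15, "administrative": 1}
HAM_COUNTS = {"free": 2, "cash": 3, "administrative": 12}
N_SPAM = 40  # hypothetical number of spam messages seen
N_HAM = 40   # hypothetical number of ham messages seen


def likelihood_ratio(word):
    """P(word | spam) / P(word | ham), each estimated with add-one smoothing."""
    p_spam = (SPAM_COUNTS.get(word, 0) + 1) / (N_SPAM + 2)
    p_ham = (HAM_COUNTS.get(word, 0) + 1) / (N_HAM + 2)
    return p_spam / p_ham


def spam_probability(bag, prior_odds=1.0):
    """Combine the per-word ratios in log space, then convert odds to a probability."""
    log_odds = log(prior_odds)
    for word in bag:
        log_odds += log(likelihood_ratio(word))
    odds = exp(log_odds)
    return odds / (1.0 + odds)


print(spam_probability(["free", "cash"]))
print(spam_probability(["administrative"]))
```

Working in log space avoids multiplying many small ratios together, and a word never seen in training gets a ratio of 1, so it leaves the odds unchanged.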
 			
 			plug.uwaterloo.org
 			
 	Bayesian Poisoning
 		Pad a small spam message with innocent words.
 		Defeats Bag-of-Words Assumption.
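A rough illustration of why padding works against a bag-of-words filter: every word contributes independently, so enough innocent words can outweigh the spammy ones. The likelihood ratios below are made-up numbers, and `log_odds` is a name chosen for this sketch.

```python
from math import log

# Hypothetical likelihood ratios P(word | spam) / P(word | ham).
RATIOS = {"free": 7.0, "cash": 4.0, "meeting": 0.2, "agenda": 0.25, "minutes": 0.25}


def log_odds(bag):
    """Sum of log likelihood ratios; unknown words count as neutral (ratio 1)."""
    return sum(log(RATIOS.get(word, 1.0)) for word in bag)


spam = ["free", "cash"]
poisoned = spam + ["meeting", "agenda", "minutes"]

print(log_odds(spam))      # positive: leans spam
print(log_odds(poisoned))  # negative: the padding outweighs the spam words
```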
 		
 		Guess The Number
 			-1,3,7,11,?
 			-1,3,7,11,39
 			
 		Occam's Razor
 			Given two hypotheses compatible with the observations, prefer the simpler one.
 			
 		Hypotheses
 			AP 
 				~40000
 				one is -1,3,7,11
 				40000:1
 			P4
 				~320000000000
 				four start with -1,3,7,11
 				80000000000:1
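The two sets of odds above can be compared directly. This sketch assumes that every hypothesis within a family is equally likely and that the two families start with equal prior odds; neither assumption is stated in the notes.

```python
from fractions import Fraction

# Counts copied from the notes above.
ap_total = 40_000              # AP hypotheses considered
ap_matching = 1                # those that are -1, 3, 7, 11
p4_total = 320_000_000_000     # P4 hypotheses considered
p4_matching = 4                # those that start with -1, 3, 7, 11

likelihood_ap = Fraction(ap_matching, ap_total)
likelihood_p4 = Fraction(p4_matching, p4_total)

# How much more strongly the observed numbers support AP than P4.
bayes_factor = likelihood_ap / likelihood_p4
print(bayes_factor)
```

The larger family spreads its probability over far more possible sequences, so each matching sequence gets less of it. That is the sense in which the simpler hypothesis wins without any explicit penalty for complexity.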
 		
 			Overfitting
 			Too many
 			
 			Bayesian interpolation produces a better sine wave.
 			
 			Lossless Compression
 			Predictable data contains less information than unpredictable data.
 			
 			Hello worl - d
 			
 			Shannon's Guessing Game
 			
 			*Prediction and Entropy of Printed English, Shannon 1950
 			
 			Original	Comparison	Reduced
 				Predictor
 			
 			Guess The Character
 			
 			T - h - e - r - e - _ - i - s - _ - n - o
 			
 			1 - 1 - 1 - 5 - 1 - 1 - 2 - 1 - 1 - 2 - 1 - 1 - 15 - 1 - 17
 			
 			Predict well = compress well
 			
 			Compress well = predict well
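One way to make the link between prediction and compression concrete is the ideal code length of a symbol, -log2(p) bits, where p is the probability the predictor gave it. The probabilities in `confident` are invented for illustration, and the 27-symbol alphabet (26 letters plus space) is an assumption.

```python
from math import log2


def code_length_bits(probability):
    """Ideal code length, in bits, for a symbol predicted with this probability."""
    return -log2(probability)


def message_bits(probabilities):
    """Total ideal code length for a sequence of predicted symbols."""
    return sum(code_length_bits(p) for p in probabilities)


uniform = [1 / 27] * 5                   # no prediction at all
confident = [0.5, 0.9, 0.9, 0.25, 0.9]   # hypothetical good predictor

print(message_bits(uniform))
print(message_bits(confident))
```

A symbol guessed correctly with high confidence costs a fraction of a bit, while a symbol that surprises the predictor costs many bits, so better prediction means a shorter encoding.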
 			
 			Learning To Predict
 			
 			Consider every possible hypothesis, update confidence in hypotheses according to how well they predict past data.
 			
 			Restrict ourselves to a limited subset of hypotheses.
 			
 			Assume each character occurs with some fixed probability given the previous k characters.
 			
 			Approximate, e.g. consider only a few of the more promising hypotheses.
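A small sketch of the restricted hypothesis class described above: the probability of the next character depends only on the previous k characters, estimated from counts with add-one smoothing. The names `train`, `predict` and `ALPHABET`, and the training sentence, are choices made for this example.

```python
from collections import Counter, defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def train(text, k):
    """Count which character follows each k-character context."""
    counts = defaultdict(Counter)
    for i in range(k, len(text)):
        counts[text[i - k:i]][text[i]] += 1
    return counts


def predict(counts, context, char):
    """P(char | context) with add-one smoothing over ALPHABET."""
    seen = counts.get(context, Counter())
    return (seen[char] + 1) / (sum(seen.values()) + len(ALPHABET))


model = train("there is no reverse on a motorcycle", k=2)
print(predict(model, "th", "e"))
print(predict(model, "th", "z"))
```

A larger k captures more structure but needs far more text before the counts mean anything, which is the overfitting trade-off mentioned earlier in the notes.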
 			
 		Further Reading
 		
 		Probability Theory
 		bayes.wustl.edu/etj/prob/book.pdf
 		
 		Information Theory, Inference, and Learning Algorithms
 		inference.phy.cam.ac.uk/itprnn/book.pdf