@jmcph4
Created October 26, 2011 06:50
Class notes on Probability Theory from University of Queensland QIP program.
Premise - What is accepted to be true (fact).
Probability - The strength of an argument.
100 people
80 healthy
20 ill
Hypothesis Testing
Text Classification
Determine document's category
Bag-of-Words Assumption
Throw all words into bag.
A Million Monkeys and a Bag of Spam
1 - FREE!
2 - guaranteed
3 - cash
4 - Administrative
Laplace's Rule of Succession
Add 1 to each count: (with + 1) : (without + 1)
1 with word, 0 without = 2:1
1 with word, 1 without = 2:2
1 with word, 2 without = 2:3
2 with, 3 without = 3:4
20 with, 80 without = 21:81
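The add-one rule above can be sketched in a few lines of Python. The function names are illustrative and do not come from the class. The probability form, (s + 1) / (n + 2), is the usual statement of Laplace's rule.

```python
from fractions import Fraction

def laplace_odds(with_word, without_word):
    """Return smoothed odds (with : without) by adding one to each count."""
    return (with_word + 1, without_word + 1)

def laplace_probability(with_word, without_word):
    """Probability that the next item contains the word: (s + 1) / (n + 2)."""
    return Fraction(with_word + 1, with_word + without_word + 2)

# The count pairs listed in the notes.
for counts in [(1, 0), (1, 1), (1, 2), (2, 3), (20, 80)]:
    print(counts, laplace_odds(*counts), laplace_probability(*counts))
```

A word never seen in one class still gets non-zero odds, so a single unseen word cannot force the estimate to 0 or 1.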
spam | ham
Python:
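from math import exp, log

def prob_spam(bag):
    log_odds = 0.0  # even prior odds, 1:1
    for word in bag:
        # likelihood_ratio(word) is not defined in these notes; presumably l_spam / l_ham
        log_odds += log(likelihood_ratio(word))
    odds = exp(log_odds)
    return odds / (1.0 + odds)  # convert odds to a probability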
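The notes do not show how the per-word likelihood ratio is obtained. The following is a minimal sketch that derives it from message counts, smoothed with Laplace's rule. The counts, the constant names, and the `prior_odds` parameter are all hypothetical and chosen only for illustration.

```python
from math import exp, log

# Hypothetical training data: number of messages containing each word.
SPAM_TOTAL, HAM_TOTAL = 100, 100
SPAM_COUNTS = {"free": 60, "guaranteed": 40, "cash": 50, "administrative": 2}
HAM_COUNTS = {"free": 5, "guaranteed": 3, "cash": 10, "administrative": 30}

def likelihood_ratio(word):
    """P(word | spam) / P(word | ham), each smoothed with Laplace's rule."""
    l_spam = (SPAM_COUNTS.get(word, 0) + 1) / (SPAM_TOTAL + 2)
    l_ham = (HAM_COUNTS.get(word, 0) + 1) / (HAM_TOTAL + 2)
    return l_spam / l_ham

def prob_spam(bag, prior_odds=1.0):
    """Sum log likelihood ratios over the bag, then convert odds to a probability."""
    log_odds = log(prior_odds)
    for word in bag:
        log_odds += log(likelihood_ratio(word))
    odds = exp(log_odds)
    return odds / (1.0 + odds)

print(prob_spam(["free", "cash"]))
print(prob_spam(["administrative"]))
```

Working in log odds keeps the running total numerically stable when a bag contains many words.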
plug.uwaterloo.org
Bayesian Poisoning
Pad a small amount of spam with many innocent words.
Defeats Bag-of-Words Assumption.
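To see why padding works against a bag-of-words filter, here is a small sketch. The likelihood ratios are made-up values, not measurements. Each innocent word pulls the total log odds towards ham, whatever the message actually says.

```python
from math import exp, log

# Hypothetical likelihood ratios P(word | spam) / P(word | ham).
RATIOS = {"free": 10.0, "cash": 5.0, "meeting": 0.2, "thanks": 0.25, "report": 0.2}

def prob_spam(bag):
    """Naive bag-of-words estimate with even prior odds; unknown words are neutral."""
    log_odds = sum(log(RATIOS.get(word, 1.0)) for word in bag)
    odds = exp(log_odds)
    return odds / (1.0 + odds)

spam = ["free", "cash"]
poisoned = spam + ["meeting", "thanks", "report", "meeting"]

print(prob_spam(spam))      # high
print(prob_spam(poisoned))  # much lower, although the spam words are unchanged
```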
Guess The Number
-1,3,7,11,?
-1,3,7,11,39
Occam's Razor
Given two hypotheses compatible with the observations, prefer the simpler one.
Hypotheses
AP
~40000
one is -1,3,7,11
40000:1
P4
~320000000000
four start with -1,3,7,11
80000000000:1
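The two odds above can be combined into one comparison. This sketch assumes that every hypothesis within a family is equally likely a priori and that the two families start at even odds. Neither assumption is stated in the notes.

```python
from fractions import Fraction

# Figures taken from the notes above.
ap_total, ap_matching = 40_000, 1
p4_total, p4_matching = 320_000_000_000, 4

# Probability each family assigns to the observed sequence, assuming all
# hypotheses inside a family are equally likely.
p_data_given_ap = Fraction(ap_matching, ap_total)
p_data_given_p4 = Fraction(p4_matching, p4_total)

# With even prior odds between the families, the posterior odds equal the
# ratio of these two probabilities.
odds_ap_over_p4 = p_data_given_ap / p_data_given_p4
print(odds_ap_over_p4)
```

The larger family spreads its probability over far more possible sequences, so it gives less support to any one of them. That is the quantitative form of Occam's razor.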
Overfitting
Too many
Bayesian interpolation produces a better sine wave.
Lossless Compression
Predictable data contains less information than unpredictable data.
Hello worl - d
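A quick way to check this claim is to run a general-purpose compressor on predictable and unpredictable input of the same length. The sketch below uses Python's standard `zlib` module. The repeated phrase and the seeded random bytes are just convenient test inputs.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

predictable = b"Hello world " * 100
unpredictable = bytes(rng.randrange(256) for _ in range(len(predictable)))

# Print original length and compressed length for each input.
print(len(predictable), len(zlib.compress(predictable)))
print(len(unpredictable), len(zlib.compress(unpredictable)))
```

The repeated text shrinks to a small fraction of its size, while the random bytes barely shrink at all.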
Shannon's Guessing Game
*Prediction and Entropy of Printed English, Shannon 1951
Original Comparison Reduced
Predictor
Guess The Character
T - h - e - r - e - _ - i - s - _ - n - o
1 - 1 - 1 - 5 - 1 - 1 - 2 - 1 - 1 - 2 - 1 - 1 - 15 - 1 - 17
Predict well = compress well
Compress well = predict well
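The guessing game can be sketched with a very crude predictor: rank candidate characters by how often they followed the previous character in a training text, then replace each character with the number of guesses needed to reach it. The order-1 model, the tiny training corpus, and all function names are assumptions for illustration, not Shannon's actual procedure.

```python
from collections import Counter, defaultdict
import string

ALPHABET = string.ascii_lowercase + " "

def train(corpus):
    """Count which character follows each character in the corpus."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1
    return follows

def ranked_guesses(follows, prev):
    """All characters, most likely first, given the previous character."""
    counts = follows[prev]
    return sorted(ALPHABET, key=lambda c: (-counts[c], c))

def encode(follows, text):
    """Replace each character by the number of guesses needed to reach it."""
    out, prev = [], " "
    for ch in text:
        out.append(ranked_guesses(follows, prev).index(ch) + 1)
        prev = ch
    return out

def decode(follows, guesses):
    """Rerun the same predictor to turn guess counts back into text."""
    out, prev = [], " "
    for g in guesses:
        prev = ranked_guesses(follows, prev)[g - 1]
        out.append(prev)
    return "".join(out)

corpus = "there is no reverse on a motorcycle and there is no time "
follows = train(corpus)
reduced = encode(follows, "there is no")
print(reduced)
print(decode(follows, reduced))
```

Because the decoder runs the identical predictor, the guess counts carry the same information as the text. A better predictor produces more 1s, and a sequence dominated by 1s is cheap to compress.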
Learning To Predict
Consider every possible hypothesis; update confidence in the hypotheses according to how well they predict past data.
Restrict ourselves to a limited subset of hypotheses.
Assume each character occurs with some fixed probability given the previous k characters.
Approximate, e.g. consider only a few of the more promising hypotheses.
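As a minimal sketch of the restricted-subset idea, take three hypotheses about a two-symbol source and reweight them after each observed symbol by the probability they assigned to it. The hypotheses, the symbols, and the observed sequence are all made up for illustration.

```python
def update(weights, hypotheses, symbol):
    """Reweight each hypothesis by the probability it gave the observed symbol."""
    new = [w * h[symbol] for w, h in zip(weights, hypotheses)]
    total = sum(new)
    return [w / total for w in new]

def predict(weights, hypotheses, symbol):
    """Mixture prediction: average the hypotheses, weighted by confidence."""
    return sum(w * h[symbol] for w, h in zip(weights, hypotheses))

# A deliberately small, hypothetical hypothesis set over the symbols "a" and "b".
hypotheses = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.5, "b": 0.5},
    {"a": 0.1, "b": 0.9},
]
weights = [1 / 3, 1 / 3, 1 / 3]  # start with equal confidence

for symbol in "aabaaaab":
    weights = update(weights, hypotheses, symbol)

print(weights)
print(predict(weights, hypotheses, "a"))
```

Hypotheses that predicted the past well end up with most of the weight, and the mixture's next prediction leans towards them without discarding the others.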
Further Reading
Probability Theory
bayes.wustl.edu/etj/prob/book.pdf
Information Theory, Inference, and Learning Algorithms
inference.phy.cam.ac.uk/itprnn/book.pdf