Numpy and scipy ways to calculate KL Divergence.
""" | |
Specifically, the Kullback–Leibler divergence from Q to P, denoted DKL(P‖Q), is | |
a measure of the information gained when one revises one's beliefs from the | |
prior probability distribution Q to the posterior probability distribution P. In | |
other words, it is the amount of information lost when Q is used to approximate | |
P. | |
""" | |
import numpy as np | |
from scipy.stats import entropy | |
def kl(p, q):
    """Kullback-Leibler divergence D(P || Q) for discrete distributions

    Parameters
    ----------
    p, q : array-like, dtype=float, shape=n
        Discrete probability distributions.
    """
    # np.float was removed in recent NumPy releases; use the builtin float
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Convention: terms with p == 0 contribute 0 to the sum
    return np.sum(np.where(p != 0, p * np.log(p / q), 0))
p = [0.1, 0.9]
q = [0.1, 0.9]
assert entropy(p, q) == kl(p, q)
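As an extra sanity check beyond the identical-distribution case above, here is a minimal sketch (the distributions are invented for illustration, and it assumes the kl() function defined above is in scope) comparing the hand-rolled implementation against scipy on two distinct distributions:

p = [0.36, 0.48, 0.16]  # arbitrary example distribution P
q = [0.30, 0.50, 0.20]  # arbitrary example distribution Q

manual = kl(p, q)          # hand-rolled implementation above
reference = entropy(p, q)  # scipy computes D(P || Q) when qk is given

assert np.isclose(manual, reference)  # compare as floats rather than exact equality
print(manual)  # ≈ 0.0103 nats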
Hi,
You mentioned p and q as discrete probabilities that you created manually, but in real-life machine learning, what values can we use? For example, a RandomForest classifier gives me probability values via predict_proba(). Can I use those? If yes, would they be P or Q, and if P, where would I get Q from (and vice versa)?
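One hedged sketch of how that could look (this is an assumption for illustration, not the gist author's setup): treat the empirical class frequencies as P and the model's averaged predict_proba() output as Q, then measure how much information is lost when Q is used to approximate P.

# Hypothetical example; all data and choices below are illustrative assumptions.
import numpy as np
from scipy.stats import entropy
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_classes=3, n_informative=5, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

q = clf.predict_proba(X).mean(axis=0)  # averaged predicted class probabilities (Q)
p = np.bincount(y) / len(y)            # empirical class frequencies (P)

print(entropy(p, q))  # D(P || Q): information lost when Q approximates P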
Unless I am mistaken, the p != 0 should be q != 0, because you can multiply by 0 but you cannot divide by 0. And in your flipped KL implementation you are dividing by q, not p.
Note that scipy.stats.entropy(pk, qk=None, base=None, axis=0) does compute the KL divergence if qk is not None.
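A small hedged illustration of that behaviour (the values are invented for the example): with qk supplied, entropy returns D(P || Q), base switches the logarithm, and since scipy normalises its inputs, raw counts work as well.

from scipy.stats import entropy

p = [0.2, 0.5, 0.3]
q = [0.1, 0.6, 0.3]

print(entropy(p))             # Shannon entropy of p (qk omitted)
print(entropy(p, q))          # KL divergence D(P || Q) in nats
print(entropy(p, q, base=2))  # the same divergence expressed in bits

# inputs are normalised internally, so unnormalised counts give the same result
print(entropy([2, 5, 3], [1, 6, 3]))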
@rodrigobdz please note that those are equivalent except for the sign: the formulation of the KL divergence with np.log(q/p) has a leading negation, which is not the case here, so the script is correct as written (cf. Wikipedia).
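A hedged numerical check of that point (distributions invented for illustration): the two formulations differ only by the sign in front of the log ratio.

import numpy as np

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.1, 0.6, 0.3])

forward = np.sum(p * np.log(p / q))   # formulation used in the gist
flipped = -np.sum(p * np.log(q / p))  # log(q/p) with a leading negation

assert np.isclose(forward, flipped)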