KL divergence loss, or Kullback-Leibler divergence loss, measures how one probability distribution diverges from a second, reference distribution. It's commonly used whenever two probability distributions need to be compared, for example when training machine learning models such as neural networks whose outputs are distributions.
In PyTorch, the KL divergence loss is implemented by the torch.nn.KLDivLoss class. This loss function compares a predicted probability distribution \( Q(x) \) against a target (true) probability distribution \( P(x) \). The formula for KL divergence is given by:
\[ D_{KL}(P \,\|\, Q) = \sum_{x} P(x) \log\left(\frac{P(x)}{Q(x)}\right) \]
where:
- \( P(x) \) is the true (target) probability distribution.
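
Below is a minimal sketch of how this loss might be used in practice. Note that, by default, torch.nn.KLDivLoss expects the predicted distribution as log-probabilities (e.g., from log_softmax) and the target as plain probabilities; the tensor shapes and the reduction="batchmean" setting here are illustrative assumptions, not requirements.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# KLDivLoss expects the input (predicted Q) in log space and the target (true P)
# as probabilities by default; "batchmean" averages the summed loss over the batch.
kl_loss = nn.KLDivLoss(reduction="batchmean")

# Hypothetical raw model outputs (logits) for a batch of 4 samples and 5 classes.
logits = torch.randn(4, 5)
log_q = F.log_softmax(logits, dim=1)   # predicted distribution Q(x), in log space

# An example target distribution P(x), normalized so each row sums to 1.
p = F.softmax(torch.randn(4, 5), dim=1)

loss = kl_loss(log_q, p)  # computes sum over x of P(x) * (log P(x) - log Q(x)), averaged per batch
print(loss)
```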