@dwf
Created February 17, 2016 22:35
Simplification of binary cross-entropy in terms of logits, yielding the familiar gradient.
\documentclass[12pt]{article}
\usepackage{amsmath}
\begin{document}
\begin{eqnarray*}
CE(a, t) & = & -t\log\left(\sigma(a)\right) - (1 - t)\log\left(1 - \sigma(a)\right) \\
& = & -t \log \left(\frac{1}{1 + \exp(-a)}\right) - (1 - t)\log\left(1 - \frac{1}{1 + \exp(-a)}\right) \\
& = & -t \log \left(\frac{1}{1 + \exp(-a)}\right) - (1 - t)\log\left(\frac{\exp(-a)}{1 + \exp(-a)}\right) \\
& = & t \log \left(1 + \exp(-a)\right) - (1 - t)\left[\log(\exp(-a)) - \log(1 + \exp(-a))\right] \\
& = & t \log \left(1 + \exp(-a)\right) - (1 - t)\log(\exp(-a)) + (1 - t)\log(1 + \exp(-a)) \\
& = & t \log \left(1 + \exp(-a)\right) + a - at + (1 - t)\log(1 + \exp(-a)) \\
& = & t \log \left(1 + \exp(-a)\right) + a - at + \log(1 + \exp(-a)) - t\log(1 + \exp(-a)) \\
& = & a - at + \log(1 + \exp(-a)) \\
\frac{\partial CE(a, t)}{\partial a} & = & 1 - t - \frac{\exp(-a)}{1 + \exp(-a)} \\
& = & \left(1 - \frac{\exp(-a)}{1 + \exp(-a)}\right) - t \\
& = & \sigma(a) - t
\end{eqnarray*}
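
As a cross-check that does not rely on the simplification above, one can differentiate the first line directly. This is a sketch under the assumption that $\sigma(a) = 1/(1 + \exp(-a))$, so that $\sigma'(a) = \sigma(a)\left(1 - \sigma(a)\right)$:
\begin{eqnarray*}
\frac{\partial CE(a, t)}{\partial a} & = & -t\,\frac{\sigma'(a)}{\sigma(a)} + (1 - t)\,\frac{\sigma'(a)}{1 - \sigma(a)} \\
& = & -t\left(1 - \sigma(a)\right) + (1 - t)\,\sigma(a) \\
& = & \sigma(a) - t
\end{eqnarray*}
For instance, at $a = 0$ and $t = 1$ this gives $\sigma(0) - 1 = -1/2$: increasing the logit lowers the loss when the target is $1$.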
\end{document}