$\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}$
$\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}$
$\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}$
$\newcommand{\numel}[1]{|#1|}$
$\newcommand{\pderivdim}[2]{\overset{\big[\numel {#1} \times \numel {#2} \big]}{\frac{\partial #1}{\partial #2}}}$
$\newcommand{\pderivdimg}[4]{\overset{\big[#3 \times #4 \big]}{\frac{\partial #1}{\partial #2}}}$
# Distributed Low-Bit Computation

Suppose we're trying to communicate a scalar parameter $\theta$ from a worker $W$ to a server $S$. $\theta$ changes with time $t$.

The worker simply communicates bits of $\theta$ asynchronously: if it sends a bit $b \in \{0, 1\}$ at time $t \in \mathbb Z^+$, we say that the worker communicated a message $(b, t)$. If the worker sends $M$ messages between times $t_1$ and $t_2$, we write $N_{t_1}^{t_2} = M$.

The server takes in these bits and uses them to build a distribution $p(\hat\theta)$ over the current value of $\theta$.

**Can we create an encoding with the following properties?**
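The setup above can be sketched directly in code. This is a minimal illustration of the defined quantities only (the encoding itself is exactly what the question above leaves open); the function names are ours, not from the source.

```python
def make_message(b, t):
    # A message is a pair (b, t): a bit b in {0, 1} sent at integer time t >= 1.
    assert b in (0, 1) and t >= 1
    return (b, t)

def n_messages(messages, t1, t2):
    # N_{t1}^{t2}: the number of messages the worker sent between times t1 and t2.
    return sum(1 for _, t in messages if t1 <= t <= t2)
```

For example, `n_messages([(1, 1), (0, 2), (1, 5)], 1, 2)` counts the two messages sent in the interval $[1, 2]$.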
$\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}$
$\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\frac{}{}\partial #2^2}}$
$\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}$
$\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}$
$\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}$
$\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}$
$\newcommand{\blue}[1]{\color{blue}{#1}}$
$\newcommand{\red}[1]{\color{red}{#1}}$
# 1) Simple Maximum Likelihood

Consider a binary variable $F$ generating an observation $X$ (graphically, $F \to X$). By Bayes' rule:

$$
p(F=1 \mid X=x) = \frac{p(X=x \mid F=1)\, p(F=1)}{p(X=x)} = \frac{p(X=x \mid F=1)\, p(F=1)}{p(X=x \mid F=0)\, p(F=0) + p(X=x \mid F=1)\, p(F=1)}
$$
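Numerically, this posterior is a one-liner. A minimal sketch (function and argument names are ours), assuming the two likelihoods and the prior are given:

```python
def posterior_f1(lik_f1, lik_f0, prior_f1):
    # p(F=1 | X=x) by Bayes' rule, for a binary variable F, where
    #   lik_f1 = p(X=x | F=1), lik_f0 = p(X=x | F=0), prior_f1 = p(F=1).
    evidence = lik_f0 * (1.0 - prior_f1) + lik_f1 * prior_f1  # p(X=x)
    return lik_f1 * prior_f1 / evidence
```

With a uniform prior the posterior reduces to the normalized likelihood: `posterior_f1(0.8, 0.2, 0.5)` gives `0.8`.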
# Temporal Networks

$\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}$

# The idea

Let
$(x, y)$ be the input and target data, and
$u_1, \dots, u_L$ be the pre-nonlinearity activations of a neural network, and
$w_1, \dots, w_L$ be the parameters, with $(\cdot w)(x) \triangleq x \cdot w$, and
$h_l(\cdot)$ be the nonlinearity of the $l$'th layer, and
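Under the usual feedforward convention (assumed here, since the recursion is not written out above), the pre-nonlinearity activations follow $u_l = h_{l-1}(u_{l-1}) \cdot w_l$ with $u_1 = x \cdot w_1$. A minimal sketch of that forward pass:

```python
import numpy as np

def forward(x, weights, nonlinearities):
    # Compute the pre-nonlinearity activations u_1, ..., u_L, assuming the
    # standard recursion u_l = h_{l-1}(u_{l-1}) @ w_l with u_1 = x @ w_1
    # (the text defines the map (.w)(x) := x . w).
    us = []
    a = x
    for w, h in zip(weights, nonlinearities):
        u = a @ w      # pre-nonlinearity activation u_l
        us.append(u)
        a = h(u)       # post-nonlinearity activation, fed to the next layer
    return us
```

With identity weight matrices and a ReLU nonlinearity, `forward` just clips negatives between layers, which makes the recursion easy to check by hand.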
# Generative Models

## Introduction

Generative models are models that learn the *distribution* of the data.

Suppose we have a collection of $N$ $D$-dimensional points: $\{x_1, ..., x_N\}$. Each $x_i$ might represent a vector of pixels in an image, or the words in a sentence.

In generative modeling, we imagine that these points are samples from a $D$-dimensional probability distribution. The distribution represents whatever real-world process was used to generate the data. Our objective is to learn the parameters of this distribution. This allows us to do things like
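As one concrete instance of "learning the parameters of this distribution" (the choice of a Gaussian family here is ours, for illustration): the maximum-likelihood fit of a multivariate Gaussian to the $N$ points has a closed form.

```python
import numpy as np

def fit_gaussian(X):
    # Maximum-likelihood estimate of a multivariate Gaussian's parameters.
    # X: (N, D) array of N D-dimensional points. Returns (mean, covariance).
    mu = X.mean(axis=0)
    centered = X - mu
    sigma = centered.T @ centered / len(X)  # MLE divides by N, not N - 1
    return mu, sigma
```

Once fit, the learned $(\mu, \Sigma)$ can be used to score new points or to sample fresh data from the model.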
$$
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}
\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}
\newcommand{\pderivgiven}[3]{\left.\frac{\partial #1}{\partial #2}\right|_{#3}}
\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}
\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}
\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}
\newcommand{\blue}[1]{\color{blue}{#1}}
\newcommand{\red}[1]{\color{red}{#1}}
$$