Last active
June 9, 2024 18:09
NAR models
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A non-autoregressive (NAR) model can be defined as:\n",
    "\n",
    "$$x_t = f(x, t)$$\n",
    "\n",
    "where $f$ is the NAR model, $x$ is the input sequence, $t$ is the time step, and $x_t$ is the output at step $t$. Each output depends only on the input and the step index, not on previously generated outputs.\n",
    "\n",
    "In a non-autoregressive model, the whole output sequence is generated in a single pass, without sequential dependencies between output positions. This can be represented mathematically as:\n",
    "\n",
    "$$\\hat{x} = f(x)$$\n",
    "\n",
    "where $x$ is the input sequence, $f$ is the NAR model, and $\\hat{x}$ is the output sequence.\n",
    "\n",
    "To define NAR sequence generation probabilistically, we assume the tokens are independent of one another, so the joint probability of the sequence decomposes into a product of per-token probabilities.\n",
    "\n",
    "**NAR Sequence Generation as a Probabilistic Model:** Let's denote the sequence of tokens as $x = (x_1, x_2, \\dots, x_n)$, where $n$ is the sequence length. Using the product rule for independent events, the joint probability of the sequence is:\n",
    "\n",
    "$$P(x) = P(x_1)P(x_2) \\dots P(x_n) = \\prod_{i=1}^n P(x_i)$$"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
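The factorization in the notebook cell can be illustrated with a minimal sketch. It assumes a made-up table of per-position token probabilities over a three-token vocabulary; `probs`, `joint_probability`, and `greedy_decode` are hypothetical names chosen for illustration and do not come from the notebook.

```python
import math

# Hypothetical per-position distributions over a 3-token vocabulary.
# Row i holds P(x_i = v) for each vocabulary index v; the values are made up.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]


def joint_probability(probs, tokens):
    """P(x) = prod_i P(x_i), assuming the tokens are independent."""
    return math.prod(row[tok] for row, tok in zip(probs, tokens))


def greedy_decode(probs):
    """Pick the most likely token at each position.

    No position looks at any other position's choice, so all positions
    could be decoded in parallel in a single pass.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


tokens = greedy_decode(probs)
p = joint_probability(probs, tokens)
print(tokens, p)
```

Because the joint probability is a plain product, the same quantity can be computed as a sum of per-token log-probabilities, which is usually more stable numerically for long sequences.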