nickovchinnikov · June 9, 2024 18:09
diff --git a/nar_models.ipynb b/nar_models.ipynb
 {
    "cells": [
     {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
       "A non-autoregressive (NAR) model can be defined per time step as:\n",
       "\n",
       "$$\\hat{x}_t = f(x, t)$$\n",
       "\n",
       "where $f$ is the NAR model, $x$ is the input sequence, $t$ is the time step, and $\\hat{x}_t$ is the output at step $t$. Each output depends only on $x$ and $t$, not on any other output.\n",
       "\n",
       "In a non-autoregressive model, the output sequence is generated in a single pass, without any sequential dependencies. This can be represented mathematically as:\n",
       "\n",
       "$$\\hat{x} = f(x)$$\n",
       "\n",
       "where $x$ is the input sequence, $f$ is the NAR model and $\\hat{x}$ is the output sequence.\n",
       "\n",
       "To define NAR sequence generation probabilistically, we assume that the tokens are independent of one another, so the joint probability of the sequence factorizes into a product of per-token probabilities. This is a modeling assumption rather than a consequence of the product (chain) rule, which would give $P(x) = \\prod_{i=1}^n P(x_i \\mid x_1, \\dots, x_{i-1})$.\n",
       "\n",
       "**NAR Sequence Generation as a Probabilistic Model:** Let's denote the sequence of tokens as $x = (x_1, x_2, \\dots, x_n)$, where $n$ is the sequence length. Assuming the tokens are independent, the joint probability of the sequence is:\n",
       "\n",
       "$P(x) = P(x_1)P(x_2) \\dots P(x_n) = \\prod_{i=1}^nP(x_i)$"
      ]
     }
    ],
    "metadata": {
     "language_info": {
      "name": "python"
     }
    },
    "nbformat": 4,
    "nbformat_minor": 2
   }
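
The factorization $P(x) = \prod_{i=1}^n P(x_i)$ from the cell above can be illustrated with a small sketch. Everything here is hypothetical and not part of the notebook: the toy vocabulary `VOCAB`, the hand-written table `position_probs` (which stands in for what a single forward pass of a NAR model might return), and the helper names `sequence_log_prob` and `decode_parallel`.

```python
import math

# Hypothetical toy vocabulary and per-position distributions.
# In a real NAR model all rows would come from one forward pass f(x).
VOCAB = ["a", "b", "c"]
position_probs = [
    [0.7, 0.2, 0.1],  # P(x_1)
    [0.1, 0.8, 0.1],  # P(x_2)
    [0.3, 0.3, 0.4],  # P(x_3)
]


def sequence_log_prob(token_ids, probs):
    """log P(x) = sum_i log P(x_i), i.e. the product of independent terms."""
    return sum(math.log(probs[i][tok]) for i, tok in enumerate(token_ids))


def decode_parallel(probs):
    """Pick the most likely token at every position independently.

    No position looks at another position's choice, so in a real model
    all positions could be decoded at the same time.
    """
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


tokens = decode_parallel(position_probs)
joint = math.exp(sequence_log_prob(tokens, position_probs))
print([VOCAB[t] for t in tokens], round(joint, 3))  # ['a', 'b', 'c'] 0.224
```

The joint probability is 0.7 × 0.8 × 0.4 = 0.224. Summing log-probabilities instead of multiplying raw probabilities is a common way to avoid underflow on long sequences; it is a design choice of this sketch, not something the notebook prescribes.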