A Map for Studying Pre-training in LLMs
- Data Collection
- General Text Data
- Specialized Data
- Data Preprocessing
- Quality Filtering
- Deduplication
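Deduplication, the last preprocessing step listed above, can be sketched as exact-match filtering over content hashes. This is a minimal illustration only — the function name and the light normalization are placeholders, and real pre-training pipelines typically add fuzzy matching (e.g. MinHash) on top of exact dedup:

```python
import hashlib

def deduplicate_exact(documents):
    """Drop documents whose normalized text has been seen before."""
    seen_hashes = set()
    unique_docs = []
    for doc in documents:
        # Normalize lightly so trivial whitespace/case changes still match
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            unique_docs.append(doc)
    return unique_docs

corpus = ["The cat sat.", "the cat sat.  ", "A different document."]
print(deduplicate_exact(corpus))  # → ['The cat sat.', 'A different document.']
```

Hashing a normalized form rather than the raw string is a common design choice: it keeps the memory cost at one digest per document while still catching near-trivial duplicates.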
```python
import gradio as gr
import numpy as np
import torch
from PIL import Image

'''
TODOs:
- Fetch the SAM model
- Fetch the inpainting model
'''
```
Bahdanau Attention is often called Additive Attention because of how its attention scores are computed: a transformed query and key are added together and passed through a non-linear activation. In contrast, Dot-Product (Multiplicative) Attention computes scores by multiplying the query and key vectors.
Let's go through the math step-by-step:
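A minimal sketch of that computation, in standard notation: $s_{t-1}$ is the previous decoder state, $h_i$ the $i$-th encoder state, $T_x$ the source length, and $W_a$, $U_a$, $v_a$ are learned parameters.

```latex
% Step 1 — alignment score: add the transformed states, apply tanh (hence "additive")
e_{t,i} = v_a^{\top} \tanh\left(W_a s_{t-1} + U_a h_i\right)

% Step 2 — normalize the scores into attention weights
\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_{k=1}^{T_x} \exp(e_{t,k})}

% Step 3 — context vector: weighted sum of the encoder states
c_t = \sum_{i=1}^{T_x} \alpha_{t,i} \, h_i
```

The addition plus $\tanh$ in step 1 is exactly what dot-product attention replaces with $s_{t-1}^{\top} h_i$.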
```python
from typing import List, Union

import numpy as np
import tensorflow as tf


def shape_list(tensor: Union[tf.Tensor, np.ndarray]) -> List[int]:
    """
    Deal with dynamic shape in tensorflow cleanly.

    Args:
        tensor (`tf.Tensor` or `np.ndarray`): The tensor we want the shape of.

    Returns:
        `List[int]`: The shape of the tensor as a list.
    """
    if isinstance(tensor, np.ndarray):
        return list(tensor.shape)

    # Prefer the static dimension where it is known; fall back to the
    # dynamic shape for dimensions that are None at graph-construction time.
    dynamic = tf.shape(tensor)
    static = tensor.shape.as_list()
    return [dynamic[i] if s is None else s for i, s in enumerate(static)]
```