Skip to content

Instantly share code, notes, and snippets.

View omarsar's full-sized avatar
🐙

Elvis Saravia omarsar

🐙
View GitHub Profile
@rxwei
rxwei / ad-manifesto.md
Last active December 6, 2024 16:54
First-Class Automatic Differentiation in Swift: A Manifesto
@phillipi
phillipi / biggan_slerp
Last active October 8, 2023 01:25
Slerp through the BigGAN latent space
# to be used in conjunction with the functions defined here:
# https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/biggan_generation_with_tf_hub.ipynb
# party parrot transformation
noise_seed_A = 3 # right facing
noise_seed_B = 31 # left facing
num_interps = 14
truncation = 0.2
category = 14
@aditya-malte
aditya-malte / smallberta_pretraining.ipynb
Created February 22, 2020 13:41
smallBERTa_Pretraining.ipynb
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@veekaybee
veekaybee / chatgpt.md
Last active March 10, 2025 07:45
Everything I understand about chatgpt

ChatGPT Resources

Context

ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?

I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.

Model Architecture

@JoaoLages
JoaoLages / RLHF.md
Last active April 3, 2025 13:14
Reinforcement Learning from Human Feedback (RLHF) - a simplified explanation

Maybe you've heard about this technique but you haven't completely understood it, especially the PPO part. This explanation might help.

We will focus on text-to-text language models 📝, such as GPT-3, BLOOM, and T5. Models like BERT, which are encoder-only, are not addressed.

Reinforcement Learning from Human Feedback (RLHF) has been successfully applied in ChatGPT, hence its major increase in popularity. 📈

RLHF is especially useful in two scenarios 🌟:

  • You can’t create a good loss function
    • Example: how do you calculate a metric to measure if the model’s output was funny?
  • You want to train with production data, but you can’t easily label your production data
@alonsosilvaallende
alonsosilvaallende / LLM_generating_random_numbers.ipynb
Created July 14, 2023 20:20
LLM_generating_random_numbers.ipynb
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@veekaybee
veekaybee / normcore-llm.md
Last active April 30, 2025 19:01
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Screenshot 2023-12-18 at 10 40 27 PM

Pre-Transformer Models