Oliver Mannion tekumara

@mitchellh
mitchellh / merge_vs_rebase_vs_squash.md
Last active March 18, 2025 21:32
Merge vs. Rebase vs. Squash

I get asked pretty regularly what my opinion is on merge commits vs rebasing vs squashing. I've typed up this response so many times that I've decided to just put it in a gist so I can reference it whenever it comes up again.

I use merge, squash, and rebase, all situationally. I believe they all have merits, but their usage depends on context. I think anyone who says any particular strategy is the right answer 100% of the time is wrong, but I think there is considerable acceptable leeway in when you use each. What follows is my personal and professional opinion:

@hannes
hannes / dlopen.md
Last active January 22, 2025 12:27

Parallel Python within the same process or hacking around the cursed GIL with a hand-rolled library loader

From its obscure beginnings in Amsterdam, the Python programming language has become a fundamental building block of our digital society. It is used literally everywhere and by everyone for a mind-bogglingly wide variety of tasks.

Python is also the lingua franca of Data Science, tying together tools for data loading, wrangling, analysis and AI. There is a massive ecosystem of contributed Python packages, which - for example - allows reading every obscure data format under the sun. This makes Python and its ecosystem extremely valuable for analytical data management systems: Users are likely somewhat familiar with Python due to its immense popularity and the ecosystem provides solutions for most data problems. As a result, Python is being integrated into SQL systems, typically through so-called User-Defined Functions (UDFs). For example, [Apach

@veekaybee
veekaybee / normcore-llm.md
Last active April 27, 2025 07:05
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

@chrisguidry
chrisguidry / stream_subscriber.py
Last active August 16, 2023 01:59
Stream the events from a Prefect Cloud workspace over Websockets
from uuid import UUID
import orjson
import pendulum
import rich.console
from websockets.client import connect
from websockets.exceptions import ConnectionClosedError
from prefect.cli import root
from prefect.cli._types import PrefectTyper
@MarkRoddy
MarkRoddy / parse_s3_access_logs.sql
Last active November 27, 2024 19:35
DuckDB: Query S3 Access Logs
/*
Usage: you'll want to search for the strings <bucket> and <prefix>, and insert the S3 bucket where your access
logs are being delivered. Use (or delete) <prefix> to filter to a subset of your logs.
*/
/*
You can either run these commented-out configuration settings yourself in the REPL and then source this file using
`.read parse_s3_access_logs.sql`, or uncomment them and supply your own values.

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

@shawwn
shawwn / example.sh
Created March 6, 2023 05:17
How I run 65B using my fork of llama at https://github.com/shawwn/llama
mp=1; size=7B; # to run 7B
mp=8; size=65B; # to run 65B
# Note: `randint` is not a standard command; it is presumably a helper from the author's environment.
for seed in $(randint 1000000)
do
  export TARGET_FOLDER=~/ml/data/llama/LLaMA
  time python3 -m torch.distributed.run --nproc_per_node $mp example.py --ckpt_dir $TARGET_FOLDER/$size --tokenizer_path $TARGET_FOLDER/tokenizer.model --seed $seed --max_seq_len 2048 --max_gen_len 2048 --count 0 | tee -a ${size}_startrek.txt
done

Notes on Forma

Recently @dragostis at Google released an experimental vector graphics renderer called Forma.

The renderer has a pretty cool set of goals: portability, performance, simplicity, and size. Graphics and GPU computation models are a topic that I'm pretty interested in learning more about personally, and this project seems like an especially accessible / well-written codebase to learn from.

I'm very happy to see this work! The era of rendering vector graphics in GPU compute shaders is upon us, and I have no doubt we'll start seeing these in production soon, as there's just such a performance advantage over CPU rendering, and I believe trying to run vector 2D graphics through the GPU rasterization pipeline doesn't quite work.

_This code is simpler than Vello (the new name for piet-gpu), focused on vector path rendering. It's also a strong demo of the power of WebGPU, while also having a performant software-only pipe

@lobre
lobre / zig_type_system.md
Last active March 15, 2025 04:56
Zig type system illustrated using ascii diagrams

Zig Type System

Zig aims to be a simple language. It is not easy to define exactly what "simple" means, but Zig is also a low-level programming language that aims for C compatibility. To reach this goal, it needs good semantics in its type system so that developers have a complete toolbox to manipulate data.

So types in Zig are composable, but this can rapidly become overwhelming. Consider these examples. Can you understand them at a glance?

*const ?u8
?*const u8
*const [2]u8
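
One way to approach the three types above is to read each from left to right, with every prefix wrapping whatever follows it. The Python sketch below is not part of the original gist; it is only a rough illustration that spells such a type out in English. The function name `describe` is hypothetical, and it handles only the `*`, `*const`, `?` and `[N]` prefixes that appear in these examples.

```python
import re


def describe(zig_type: str) -> str:
    """Spell out a Zig type expression in English, reading left to right.

    Illustrative only: covers `*const`, `*`, `?` and `[N]` prefixes,
    and treats whatever remains as the base type name.
    """
    parts = []
    rest = zig_type.strip()
    while rest:
        if rest.startswith("*const "):
            parts.append("pointer to constant")
            rest = rest[len("*const "):]
        elif rest.startswith("*"):
            parts.append("pointer to")
            rest = rest[1:]
        elif rest.startswith("?"):
            parts.append("optional")
            rest = rest[1:]
        elif (m := re.match(r"\[(\d+)\]", rest)):
            parts.append(f"array of {m.group(1)}")
            rest = rest[m.end():]
        else:
            # No known prefix left: the remainder is the base type.
            parts.append(rest)
            rest = ""
    return " ".join(parts)


for t in ["*const ?u8", "?*const u8", "*const [2]u8"]:
    print(f"{t:14} {describe(t)}")
```

Under this reading, `*const ?u8` is a pointer to a constant optional `u8`, while `?*const u8` is an optional pointer to a constant `u8`: the same pieces in a different order give a different type.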
@veekaybee
veekaybee / chatgpt.md
Last active March 10, 2025 07:45
Everything I understand about chatgpt

ChatGPT Resources

Context

ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?

I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.

Model Architecture