
🐈‍⬛
meow

Christopher Akiki cakiki

@tokenbender
tokenbender / train_modal_standalone.py
Last active October 12, 2025 06:57
standalone serverless simple character level transformer
import os
import sys
import time
import math
import pickle
from contextlib import nullcontext
from pathlib import Path
import subprocess
from dataclasses import dataclass
import inspect

Learning LLMs in 2025

So you know how the transformer works, you know basic ML/DL, and you want to learn more about LLMs. One way to go is to look into the various "algorithmic" topics (optimization algorithms, RL, DPO, etc.). There is lots of material on that. But the interesting stuff is (in my opinion at least) not there.

This is an attempt to collect a list of academic (or academic-like) materials that explore LLMs from other directions, and focus on the non-ML-algorithmic aspects.

Courses

  • David Chiang's Theory of Neural Networks course.
  • This is not primarily about LLMs, but it does have a substantial section on Transformers. Formal/theory. More of a book than a course.
@fxkamd
fxkamd / bert-tiny-amd.md
Created October 1, 2024 19:06
Solutions to problems with BERT training with tinygrad on AMD GPUs

Thank you to tiny corp for pointing out some problems running BERT training with Tinygrad on AMD GPUs in this Tweet. We had a few engineers at AMD take a look at the problem and they were quickly able to reproduce it.

What they found was an issue related to CWSR (compute wave save restore), which is a mechanism that allows our driver and firmware to preempt and reschedule long-running compute waves on our GPUs. The GFXv11 GPU line requires a workaround to set COMPUTE_PGM_RSRC1.PRIV=1 when dispatching a compute kernel. Normally this is handled by the AQL DISPATCH packet. However, since the Tinygrad implementation leverages a custom runtime, it requires this workaround in its PM4-based dispatch. This patch is specific to GFXv11 GPUs. Other GPUs do not require it and should not use this workaround. The following KFDTest patch can be used as a reference: https://github.com/ROCm/ROCT-Thunk-Interface/commit/507637ed5b82197eecbf483cdc1234939766549a
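
As a rough illustration of the workaround described above, the sketch below sets the PRIV bit in a `COMPUTE_PGM_RSRC1` value only for GFXv11 targets. The function name, the `gfx_major` parameter, and the bit position (bit 20, taken from the LLVM AMDGPU kernel-descriptor layout) are assumptions made for illustration; the referenced KFDTest patch remains the authoritative source.

```python
# Assumption: PRIV is bit 20 of COMPUTE_PGM_RSRC1, as listed in the LLVM
# AMDGPU kernel-descriptor layout. Verify against the register spec for
# your target before relying on it.
COMPUTE_PGM_RSRC1_PRIV = 1 << 20


def apply_gfx11_priv_workaround(rsrc1: int, gfx_major: int) -> int:
    """Return the RSRC1 value to program for a PM4-based dispatch.

    Hypothetical helper: sets PRIV=1 only on GFXv11, since other GPU
    generations do not need the workaround and should not use it.
    """
    if gfx_major == 11:
        return rsrc1 | COMPUTE_PGM_RSRC1_PRIV
    return rsrc1
```

A custom runtime would call something like this just before writing the register in its own dispatch path; runtimes that go through the AQL DISPATCH packet would not need it.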

While inv

@severo
severo / set_gated.py
Created August 12, 2022 16:00
A function to set the gated parameter on a HF repository
from huggingface_hub.hf_api import (  # type: ignore
    REPO_TYPES,
    REPO_TYPES_URL_PREFIXES,
    HfApi,
    _raise_for_status,
)
def update_repo_settings(
hf_api: HfApi,
repo_id: str,
@stefan-it
stefan-it / tpu_vm_cheatsheet.md
Last active December 5, 2024 23:33
TPU VM Cheatsheet

TPU VM Cheatsheet

This TPU VM cheatsheet uses and was tested with the following library versions:

| Library      | Version |
|--------------|---------|
| JAX          | 0.3.25  |
| FLAX         | 0.6.4   |
| Datasets     | 2.10.1  |
| Transformers | 4.27.1  |
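
To see whether an environment matches the versions listed above, a small check along these lines may help. It is a minimal sketch: the helper names are hypothetical, and it assumes the libraries are installed under the distribution names `jax`, `flax`, `datasets`, and `transformers`.

```python
from importlib import metadata

# Versions the cheatsheet states it was tested with.
TESTED_VERSIONS = {
    "jax": "0.3.25",
    "flax": "0.6.4",
    "datasets": "2.10.1",
    "transformers": "4.27.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(tested=TESTED_VERSIONS):
    """Map each library to a (tested, installed, matches) tuple."""
    report = {}
    for name, wanted in tested.items():
        found = installed_version(name)
        report[name] = (wanted, found, found == wanted)
    return report


def print_report(tested=TESTED_VERSIONS):
    """Print one line per library showing tested vs. installed version."""
    for name, (wanted, found, ok) in compare_versions(tested).items():
        status = "ok" if ok else "differs"
        print(f"{name}: tested {wanted}, installed {found} ({status})")
```

Calling `print_report()` on the TPU VM lists any library whose installed version differs from the tested one.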
@lmcinnes
lmcinnes / doc_embeddings_with_vectorizers.ipynb
Last active November 9, 2023 04:31
Document Embeddings with the Vectorizers Library
@Garfounkel
Garfounkel / gpu_tfidf_demo.ipynb
Last active August 6, 2025 02:38
notebooks/gpu_tfidf_demo.ipynb
@cbuntain
cbuntain / agreement.py
Created March 19, 2020 18:38
Example of using NLTK's agreement package to calculate agreement scores for an annotation task
#
# Author: Cody Buntain
# Date: 19 March 2020
#
# Description:
# This code is an example of using the agreement package
#  in NLTK to calculate a number of agreement metrics on
#  a set of annotations. Currently, this code will work
#  with two annotators and multiple labels.
#  You can use Fleiss's Kappa or Krippendorff's Alpha if you
@nstarke
nstarke / resize-ghidra-gui.md
Last active November 12, 2025 13:43
Resize Ghidra GUI for High DPI screens

Resize Ghidra for High DPI screens

If you run Ghidra on a high DPI screen, you will probably find the GUI scaled down so small as to be almost unusable.

There is a setting that you can adjust to scale the Ghidra GUI:

In $GHIDRA_ROOT/support there is a file named launch.properties. This file contains the following configuration key:

VMARGS_LINUX=-Dsun.java2d.uiScale=1
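
The edit to this key can also be scripted. The following is a minimal sketch in Python, assuming the entry appears on its own line exactly in the form shown above; the function name `set_ui_scale` is hypothetical, and the scale of 2 in the usage example is only an illustration.

```python
import re


def set_ui_scale(properties_text: str, scale: int, key: str = "VMARGS_LINUX") -> str:
    """Return the properties text with the uiScale value on `key` replaced.

    Only lines of the form `<key>=-Dsun.java2d.uiScale=<value>` are touched.
    Raises ValueError if no such line exists, so nothing is changed silently.
    """
    pattern = re.compile(
        rf"^({re.escape(key)}=-Dsun\.java2d\.uiScale=)\S+$", re.MULTILINE
    )
    updated, count = pattern.subn(rf"\g<1>{scale}", properties_text)
    if count == 0:
        raise ValueError(f"no {key} uiScale entry found")
    return updated


# Usage: read launch.properties, update the value, write the result back.
sample = "VMARGS_LINUX=-Dsun.java2d.uiScale=1\n"
print(set_ui_scale(sample, 2), end="")
```

Working on the file contents as a string keeps the helper easy to test; reading and writing `$GHIDRA_ROOT/support/launch.properties` is left to the caller.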