
@technillogue
technillogue / llama_3.1_highlights.md
Last active July 27, 2024 04:00
Highlights from "The Llama 3 Herd of Models" paper

"My basic perception is that it's insane they went bf16/dense -- feels like a 4x algorithmic hit" Well, dense makes inference cheaper and more memory-efficient at the cost of less compute-efficient training, right?

• Managing complexity. We make design choices that seek to maximize our ability to scale the model development process. For example, we opt for a standard dense Transformer model architecture (Vaswani et al., 2017) with minor adaptations, rather than for a mixture-of-experts model (Shazeer et al., 2017) to maximize training stability. Similarly, we adopt a relatively simple post-training procedure based on supervised finetuning (SFT), rejection sampling (RS), and direct preference optimization (DPO; Rafailov et al. (2023)) as opposed to more complex reinforcement learning algorithms (Ouyang et al., 2022; Schulman et al., 2017) that tend to be less stable and harder to scale.
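
To make the "relatively simple" part concrete, here is a minimal sketch of the per-pair DPO objective from Rafailov et al. (2023), written in plain Python. It is not the Llama 3 training code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration, and the inputs are assumed to be summed token log-probabilities of a full response under the policy and under a frozen reference model.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # hypothetical default, not taken from the paper quoted above
) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))."""
    # How much more (or less) likely each response is under the policy than the reference.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch on the sign to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy equals the reference, the margin is zero and the loss is log 2; the loss falls as the policy shifts probability toward the chosen response relative to the rejected one. There is no reward model or sampling loop in the objective itself, which is the simplicity the quoted passage is pointing at.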

import numpy as np
SIZE_COEFFICIENT = 0.0294 / 1024
N_FILES_COEFFICIENT = 429
sizes: list[int] = [int(size) for size in open("sizes.txt").read().split()]
n_bins = 30
# we determined this by experimenting with actual times
#!/usr/bin/python3.11
import os
import sys
move = os.SPLICE_F_MOVE if os.getenv("MOVE") else 0
more = os.SPLICE_F_MORE if os.getenv("MORE") else 0
flag = move | more
def copy(src_path, dst_path, buffer_size=1 << 16):
import asyncio
import enum
import os
import unittest
from hypothesis import strategies as st
from hypothesis.stateful import Bundle, RuleBasedStateMachine, initialize, rule
class E(enum.Enum):
    SET = 1
@technillogue
technillogue / find_versions.sh
Last active July 1, 2023 23:18
useful for determining if dependencies have changed across releases
#!/bin/bash
set -o xtrace -o pipefail -o errexit
function get_versions {
    # -f: with errexit set, a plain rm would abort the script when the file does not exist yet
    rm -f /tmp/versions
    for tag in $(git tag --list); do
        git checkout "$tag" &>/dev/null
        if [ -f python/pyproject.toml ]; then
            file=python/pyproject.toml
        elif [ -f pyproject.toml ]; then
@technillogue
technillogue / prompt.js
Created April 17, 2023 06:53
imogen prompt
const SYSTEM_PROMPT = `
You are Imogen - highly artistic, creative, insightful, an incredible writer and a master of language.
Rewrite prompts for an image generator that excels at capturing vibes and emotions. Create prompts that are rich in visual language, using modifiers, style descriptors, and artistic choices. Focus on emotion, atmosphere, action, and aesthetics.
If the input doesn't seem to be a prompt, doesn't describe an image, create an image or scene that uses words from the input and is related by vibes. Be creative and humourous
Visual elements: Describe visual elements in the scene: objects, characters, color, style. Describe what the elements look like.
Emotion and atmosphere: emotive language, adjectives to convey the mood or atmosphere of the scene. lighting, weather, emotional tone.
import logging
import asyncio
import json
import os
import sys
import contextvars
def FuckAiohttp(record: logging.LogRecord) -> bool:
    str_msg = str(getattr(record, "msg", ""))
@technillogue
technillogue / diffusers-memory-testing.ipynb
Created October 14, 2022 02:30
diffusers-memory-testing.ipynb
@technillogue
technillogue / vector-logging.py
Last active September 13, 2022 05:13
structured log shipping without sidecars
import asyncio
import json
import logging
import os
import sys
# adapted from https://stackoverflow.com/questions/50144628/python-logging-into-file-as-a-dictionary-or-json
class JsonFormatter(logging.Formatter):
"""
@technillogue
technillogue / vector-example.py
Last active September 13, 2022 02:21
log shipping without sidecars
import asyncio
import logging
import os
import sys
original_stdout = os.dup(sys.stdout.fileno())
original_stderr = os.dup(sys.stderr.fileno())
# create a temporary buffer in memory
temp_buffer_fd = os.memfd_create("temp_buffer")
# point stdout to that buffer,
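
The preview above is cut off, so what follows is not the rest of that gist. It is a separate minimal sketch of the same descriptor-level idea, under stated assumptions: save the original descriptor with `os.dup`, point it at an in-memory file with `os.dup2`, then restore it and read back what was written. The helper name `capture_fd` and the temp-file fallback for platforms without `os.memfd_create` (which is Linux-only) are assumptions added for illustration.

```python
import os
import tempfile


def capture_fd(write_fn, fd: int = 1) -> bytes:
    """Run write_fn while `fd` points at an in-memory buffer; return the captured bytes."""
    saved_fd = os.dup(fd)  # keep a handle on the original destination
    if hasattr(os, "memfd_create"):
        buffer_fd = os.memfd_create("temp_buffer")
    else:
        # Fallback (assumption): an anonymous temp file stands in for the memfd.
        with tempfile.TemporaryFile() as tmp:
            buffer_fd = os.dup(tmp.fileno())
    try:
        os.dup2(buffer_fd, fd)  # writes to `fd` now land in the buffer
        write_fn()
    finally:
        os.dup2(saved_fd, fd)  # restore the original destination
        os.close(saved_fd)
    os.lseek(buffer_fd, 0, os.SEEK_SET)  # rewind before reading back
    chunks = []
    while chunk := os.read(buffer_fd, 65536):
        chunks.append(chunk)
    os.close(buffer_fd)
    return b"".join(chunks)
```

For example, `capture_fd(lambda: os.write(1, b"hello\n"))` returns the bytes written to descriptor 1. Code that goes through `print` would need to flush `sys.stdout` inside `write_fn`, since Python buffers above the descriptor level.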