"My basic perception is that's it's insane they went bf16/dense -- feels like a 4x algorithmic hit" Well, dense makes inference cheaper and more memory-efficient at the cost of less compute-efficient training, right?
• Managing complexity. We make design choices that seek to maximize our ability to scale the model development process. For example, we opt for a standard dense Transformer model architecture (Vaswani et al., 2017) with minor adaptations, rather than for a mixture-of-experts model (Shazeer et al., 2017) to maximize training stability. Similarly, we adopt a relatively simple post-training procedure based on supervised finetuning (SFT), rejection sampling (RS), and direct preference optimization (DPO; Rafailov et al. (2023)) as opposed to more complex reinforcement learning algorithms (Ouyang et al., 2022; Schulman et al., 2017) that tend to be less stable and harder to scale.
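The "simpler to scale" point about DPO is easy to see in code: it reduces preference optimization to a single supervised-style loss over (chosen, rejected) pairs, with no reward model, rollouts, or PPO machinery. A minimal sketch of the objective from Rafailov et al. (2023), where tensor names and shapes are my own assumptions:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,    # log pi_theta(y_w | x), shape [batch]
             policy_rejected_logps: torch.Tensor,  # log pi_theta(y_l | x)
             ref_chosen_logps: torch.Tensor,       # log pi_ref(y_w | x), frozen reference
             ref_rejected_logps: torch.Tensor,     # log pi_ref(y_l | x)
             beta: float = 0.1) -> torch.Tensor:
    # Each response's implicit "reward" is its log-prob ratio against the reference model.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Logistic loss that pushes the chosen response's reward above the rejected one's.
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```

Compared to PPO-style RLHF, there is no separate reward model to train and no on-policy sampling loop during optimization, which is presumably what makes it more stable and easier to scale.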