This aims to be factual information about the size of large language models. None of this document was written by AI. I do not include any information from leaks or rumors. The focus of this document is on base models (the raw text continuation engines, not 'helpful chatbot/assistants'). This is a view, spanning a few years ago to today, of one very tiny fraction of the larger LLM story that is unfolding.
- GPT-2, -medium, -large, -xl (2019): 137M, 380M, 812M, 1.61B. Source: openai-community/gpt2. Trained on the unreleased WebText dataset, said to be 40GB of Internet text, which I estimate to be roughly 10B tokens. You can see a list of the websites that went into that dataset here: domains.txt.
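  A rough back-of-envelope check of that token estimate (a sketch under my own assumption of about 4 bytes of English text per GPT-2 BPE token on average, not a figure from OpenAI):

  ```python
  # Rough estimate of WebText's token count from its reported size.
  # Assumption: ~4 bytes of English web text per GPT-2 BPE token on average.
  webtext_bytes = 40e9                    # "40GB of Internet text"
  bytes_per_token = 4                     # assumed average, not measured
  tokens = webtext_bytes / bytes_per_token
  print(f"~{tokens / 1e9:.0f}B tokens")   # -> ~10B tokens
  ```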
- GPT-3 aka davinci, davinci-002 (2020): 175B parameters. There is a good breakdown of how those parameters are 'spent' here [How d