rain-1 / cuda-beginners-guide.md

Last active November 20, 2025 10:56

cuda-beginners-guide.md

Tutorial: Vector Addition

In this post we will cover how to solve the simplest CUDA problem: adding two arrays. I'll explain the code, step by step

Step 1 - Understanding the template

Here is the initial template that we are given:

rain-1 / base model trends.md

Last active October 19, 2025 08:19

base model trends.md

How large are large language models? (2025)

This aims to be factual information about the size of large language models. None of this document was written by AI. I do not include any information from leaks or rumors. The focus of this document is on base models (the raw text continuation engines, not 'helpful chatbot/assistants'). This is a view from a few years ago to today of one very tiny fraction of the larger LLM story that's happening.

History

GPT-2,-medium,-large,-xl (2019): 137M, 380M, 812M, 1.61B. Source: openai-community/gpt2. Trained on the unreleased WebText dataset said to 40GB of Internet text - I estimate that to be roughly 10B tokens. You can see a list of the websites that went into that data set here domains.txt.
GPT-3 aka davinci, davinci-002 (2020): 175B parameters. There is a good breakdown of how those parameters are 'spent' here [How d

rain-1 / organize images using AI.md

Last active August 30, 2025 10:10

organize images using AI

Organize Images into folders using AI

This is a tool that sorts images into folders using "AI". You create the folders you want the images to be put into.

I got 'claude' to write with a couple prompts. Make venv and install deps with pip.

Use microsoft/git-large-coco To turn images into short descriptions of images - https://huggingface.co/microsoft/git-large-coco
Use all-MiniLM-L6-v2 to turn short descriptions into 'embedding vectors' - https://www.sbert.net/docs/sentence_transformer/pretrained_models.html
Compare the embedding vector of the image, with the embedding vectors of the folders - choose the best folder to put the image into

rain-1 / howtoloom.md

Last active October 1, 2024 19:52

How to get started with the loom

What is this

This is a guide on how to set up a way to loom with a local LLM on your computer. Loom is the name for a way of making use of an LLM base model, such as GPT-2, in order to read, write and explore generated text.

To loom you can use the following software stack:

Obsidian with Loomsidian plugin
llama.cpp llama-server running GPT2

Obsidian

rain-1 / On Llamafile.md

Last active June 25, 2024 14:31

On Llamafile

On Llamafile not making sense

The LLamafile project doesn't make sense.

The claim is that it is "bringing LLMs to the people", but you could already run an LLM - which is a large binary file containing lots of floating point numbers - by using llama.cpp.

Llamafile joins a compiled binary program to run LLMs with a weights binary into a single file. This isn't a useful goal. you could simply distribute a zip containing an .exe and a weights file together. Or better still: Decouple the program that runs these chatbots from the chatbot weights.

Imagine if PNG files were also an executable that could pop open a window that displays a PNG on your computer. There is a reason we don't do this: It's not good engineering.

rain-1 / Scheme WASM Tail Call Situation.md

Created November 8, 2023 14:24

Scheme WASM Tail Call Situation

There is a specification for a tail call instruction in wasm. github.com/WebAssembly/tail-call https://v8.dev/blog/wasm-tail-call
Browsers, e.g. Chrome, implement this. https://webassembly.org/roadmap/
At least one scheme implementation called Guile-Hoot targets this https://gitlab.com/spritely/guile-hoot ABI spritely.institute/Status

This spec seems to have gotten in thanks to work by apignotti, https://hn.algolia.com/?q=WebAssembly+tail+calls

There is also a very interesting project for generalized effect handlers that may build on top of this platform https://wasmfx.dev/community/

Great news for schemers with web browsers.

rain-1 / 000-crossword-solving.md

Last active April 18, 2025 17:06

crossword-solving-with-gpt.md

Solving crosswords with GPT

This is my research report. I've included a lot of the code and chat interactions for people to read through if interested. I worked on this crossword https://www.theguardian.com/crosswords/quick/16553

I had a vision for a GPT powered crossword solver. My idea is that it would do a tree search over GPT generated guesses that would include the knowns so far, like:

I didn't end up doing that because ChatGPT and GPT-4 are terrible at questions involving the length of words, or guessing words that contain specific letters at specific locations. It can sometimes do them but usually fails. I think this is because it's token based. I am curious whether a character based LLM would be better at such tasks.

rain-1 / llama-home.md

Last active June 24, 2025 11:12

How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2, and the version of GPT-3 that has not been fine tuned yet. It is also possible to run fine tuned versions (like alpaca or vicuna with this. I think. Those versions are more focused on answering questions)

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run LLama 13B with a 6GB graphics card now! (e.g. a RTX 2060). Thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS which allows you pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.

Clone llama.cpp from git, I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.

rain-1 / Prompt Injection and AutoGPT.md

Last active September 11, 2023 11:12

Prompt Injection and AutoGPT

Does prompt injection matter to AutoGPT?

Executive summary: If you use AutoGPT, you need to be aware of prompt injection. This is a serious problem that can cause your AutoGPT agent to perform unexpected and unwanted tasks. Unfortunately, there isn't a perfect solution to this problem available yet.

Prompt injection can derail agents

If you set up an AutoGPT agent to perform task A, a prompt injection could 'derail' it into performing task B instead. Task B could be anything. Even something unwanted like deleting your personal files or sending all your bitcoins to some crooks wallet.

Docker helps limit the file system access that agents have. Measures like this are extremely useful. It's important to note that the agent can still be derailed.

rain-1 / bard-lstm.md

Last active May 7, 2023 22:09

bard-lstm.md

This is a report on my experience pair programming with Bard on a neural network task that challenged it to its current limits.

Bard now has the ability to program, or put another way Google has removed the gating that blocked it from trying.

All the code in this article is basically 99% produced by Bard. I either prompted it to refactor things or I just tweaked one line or two lines of every 100.

Note: I used gpt-4 a little bit too, for the training part, but this is mostly Bard.

rain1 rain-1