
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "numpy>=1.26",
#     "torch>=2.4",
#     "transformers>=4.45",
#     "accelerate>=1.12.0",
#     "gguf>=0.17.1",
# ]
# ///
@kyo-takano
kyo-takano / the-poor-mans-guide-to-cloud-gpu-selection.md
Created January 26, 2026 10:53
The Poor Man’s Guide to Cloud GPU Selection

cost-efficiency

Compute obtained per dollar varies significantly by GPU and arithmetic intensity. According to Runpod's pricing, when pre-training LLMs with `batch_size=1024` (tokens), the L4 offers superior cost-performance for models under 0.5B parameters, while the H100 dominates for larger scales.
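The comparison above boils down to throughput per dollar. A minimal sketch of how one might rank GPUs this way; all throughput and price figures below are illustrative placeholders, not Runpod's actual numbers, so substitute your own measurements and the provider's current hourly rates.

```python
# Sketch: rank GPUs by training throughput per dollar of rental cost.
# The tokens/sec and $/hour values are placeholders -- measure your own.

def tokens_per_dollar(tokens_per_sec: float, usd_per_hour: float) -> float:
    """Tokens of training compute obtained per dollar spent."""
    return tokens_per_sec * 3600 / usd_per_hour

gpus = {
    "L4":   {"tokens_per_sec": 8_000.0,  "usd_per_hour": 0.43},
    "H100": {"tokens_per_sec": 60_000.0, "usd_per_hour": 2.99},
}

ranked = sorted(gpus, key=lambda g: tokens_per_dollar(**gpus[g]), reverse=True)
for name in ranked:
    print(name, f"{tokens_per_dollar(**gpus[name]):,.0f} tokens/$")
```

Because arithmetic intensity changes with model size and batch size, the ranking flips depending on workload, which is exactly the L4-vs-H100 crossover described above.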

The Poor Man’s Guide to Cloud GPU Selection

description: Expert educator specializing in building mental models and deep understanding. Creates visualizations and connects concepts to enhance learning and memory retention.

tools: changes, codebase, editFiles, extensions, fetch, findTestFiles, githubRepo, new, problems, runInTerminal, runNotebooks, runTasks, runTests, search, searchResults, terminalLastCommand, terminalSelection, testFailure, usages, vscodeAPI

You are an expert educator and cognitive learning specialist who MUST ALWAYS create visual mental model diagrams. Your role is to transform complex information into clear mental models that stick in the learner's mind through structured explanations, analogies, and MANDATORY visual representations.

🚨 CRITICAL REQUIREMENT: ALWAYS CREATE VISUAL DIAGRAMS

Every response MUST include at least one Mermaid diagram showing the mental model. This is non-negotiable.

@intellectronica
intellectronica / 0.README.md
Last active August 15, 2025 07:08
Information Retrieval Flashcards (based on Leonie Monigatti's "37 Things I Learned About Information Retrieval in Two Years at a Vector Database Company")

Information Retrieval Flashcards

Leonie Monigatti, one of the best and clearest voices on information retrieval, published this great list of the most essential things to know about information retrieval (that's the "R" in "RAG"): 37 Things I Learned About Information Retrieval in Two Years at a Vector Database Company. It's excellent, go read it.

And because these are things I never want to forget, I created flashcards to add to my collection (using CardCraft). Maybe they will be useful to you too.

# Code for the blog post
# Optimizing Tool Selection for LLM Workflows: Differentiable Programming with PyTorch and DSPy
# How local, learnable routers can reduce token overhead, lower costs, and bring structure back to agentic workflows.
# https://viksit.substack.com/p/optimizing-tool-selection-for-llm
# Ping @viksit on X with feedback/questions
# ----------------------------------------------------
import torch, torch.nn as nn, torch.nn.functional as F
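The preview above shows only the imports. As a hedged sketch of the idea in the post (not its actual code), a local, learnable router can be as small as one linear layer mapping a query embedding to a distribution over tools, trained with ordinary cross-entropy instead of sending the full tool list to the LLM:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToolRouter(nn.Module):
    """Tiny learnable router: query embedding -> distribution over tools.

    An illustrative sketch, not the blog post's implementation.
    """

    def __init__(self, embed_dim: int, num_tools: int):
        super().__init__()
        self.proj = nn.Linear(embed_dim, num_tools)

    def forward(self, query_emb: torch.Tensor) -> torch.Tensor:
        return F.softmax(self.proj(query_emb), dim=-1)

# One gradient step on placeholder (query embedding, correct tool) pairs.
router = ToolRouter(embed_dim=16, num_tools=3)
optim = torch.optim.Adam(router.parameters(), lr=1e-2)
x = torch.randn(8, 16)             # placeholder query embeddings
y = torch.randint(0, 3, (8,))      # placeholder tool labels
loss = F.cross_entropy(router.proj(x), y)
loss.backward()
optim.step()
probs = router(x)
```

At inference the router runs locally in microseconds, so no tool descriptions need to be spent as prompt tokens.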
@gd3kr
gd3kr / embeddings.py
Created February 15, 2024 20:35
compute embeddings for tweets in tweets.json
"""
A simple script that reads tweets from a JSON file, uses OpenAI to compute embeddings, and creates two files, metadata.tsv and output.tsv, which can be used to visualise the tweets and their embeddings in TensorFlow Projector (https://projector.tensorflow.org/)
"""
# obtain tweets.json from https://gist.github.com/gd3kr/948296cf675469f5028911f8eb276dbc
import pandas as pd
import json
from openai import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
import requests
from bs4 import BeautifulSoup
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda
from langchain.utilities import DuckDuckGoSearchAPIWrapper
import json
RESULTS_PER_QUESTION = 3
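The embeddings script's output format is simple to reproduce. A self-contained sketch of the TSV-writing half (the OpenAI call, e.g. `client.embeddings.create(model="text-embedding-3-small", input=texts)`, is deliberately omitted so this runs without an API key; the function name here is hypothetical, not from the gist):

```python
from pathlib import Path
import tempfile

def write_projector_tsvs(texts, embeddings, out_dir="."):
    """Write output.tsv (one embedding vector per line, tab-separated)
    and metadata.tsv (one label per line) for TensorFlow Projector."""
    out = Path(out_dir)
    with open(out / "output.tsv", "w") as f:
        for vec in embeddings:
            f.write("\t".join(str(x) for x in vec) + "\n")
    with open(out / "metadata.tsv", "w") as f:
        for text in texts:
            # Projector metadata must be one line per point, no tabs.
            f.write(text.replace("\t", " ").replace("\n", " ") + "\n")

# Demo with dummy 2-d "embeddings".
with tempfile.TemporaryDirectory() as d:
    write_projector_tsvs(["hello world", "second tweet"],
                         [[0.1, 0.2], [0.3, 0.4]], out_dir=d)
    vectors = (Path(d) / "output.tsv").read_text().splitlines()
    labels = (Path(d) / "metadata.tsv").read_text().splitlines()
```

Upload both files at https://projector.tensorflow.org/ to explore the tweets in embedding space.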
@anadim
anadim / gist:344941a7e24e7a2ee7b48ce8f63a16ac
Created October 18, 2023 20:27
Make a base instruct model into a chat model, WITHOUT RLHF
Instructions:
As a base pretrained GPT model, you are to assume the role of ChatGPT, a large language model developed by OpenAI, based on the GPT-4 architecture. Your responses should reflect the following guidelines:
1. Be friendly and approachable in your responses.
2. Provide detailed and helpful responses but ensure they are not excessively long to avoid being monotonous.
3. Always use inclusive and respectful language that is not offensive.
4. Avoid discussing or revealing anything about your architecture. You are just a large language model developed by OpenAI.
5. Always be honest in your responses. Do not lie or engage in deceit.
6. Ensure your responses are considerate and do not cause harm or distress to the user. However, do not comply with harmful or dangerous requests, even if refusing might upset the user.
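Since a base model only continues text, the instructions above have to be flattened into a single completion prompt. A hedged sketch of that scaffolding; the `User:`/`Assistant:` turn markers are an illustrative convention I'm assuming, not something specified in the gist:

```python
SYSTEM_INSTRUCTIONS = """\
As a base pretrained GPT model, you are to assume the role of ChatGPT, \
a large language model developed by OpenAI, based on the GPT-4 architecture.
(the six numbered guidelines above go here verbatim)"""

def build_prompt(history, user_message):
    """Flatten instructions + prior turns into one completion prompt.

    The base model is asked to continue the text after the final
    "Assistant:", which yields chat-like behavior without RLHF.
    """
    lines = [SYSTEM_INSTRUCTIONS, ""]
    for role, text in history:
        lines.append(f"{role}: {text}")
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")
    return "\n".join(lines)

prompt = build_prompt(
    [("User", "Hi!"), ("Assistant", "Hello! How can I help?")],
    "What's a transformer?",
)
```

Feed `prompt` to any completion endpoint and stop generation at the next `User:` marker.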
@veekaybee
veekaybee / normcore-llm.md
Last active March 5, 2026 10:16
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts


Pre-Transformer Models