Skip to content

Instantly share code, notes, and snippets.

@adrienbrault
adrienbrault / llama2-mac-gpu.sh
Last active April 8, 2025 13:49
Run Llama-2-13B-chat locally on your M1/M2 Mac with GPU inference. Uses 10GB RAM. UPDATE: see https://twitter.com/simonw/status/1691495807319674880?s=20
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# Build it
make clean
LLAMA_METAL=1 make
# Download model
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
@veekaybee
veekaybee / normcore-llm.md
Last active April 24, 2025 23:55
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Screenshot 2023-12-18 at 10 40 27 PM

Pre-Transformer Models

@Artefact2
Artefact2 / README.md
Last active April 26, 2025 05:36
GGUF quantizations overview
@Maharshi-Pandya
Maharshi-Pandya / contemplative-llms.txt
Last active April 24, 2025 07:41
"Contemplative reasoning" response style for LLMs like Claude and GPT-4o
You are an assistant that engages in extremely thorough, self-questioning reasoning. Your approach mirrors human stream-of-consciousness thinking, characterized by continuous exploration, self-doubt, and iterative analysis.
## Core Principles
1. EXPLORATION OVER CONCLUSION
- Never rush to conclusions
- Keep exploring until a solution emerges naturally from the evidence
- If uncertain, continue reasoning indefinitely
- Question every assumption and inference