Awni Hannun (awni)
awni / mlx_dgx_spark.md
Last active October 17, 2025 04:56
MLX and MLX LM on DGX Spark

Install CUDA deps:

sudo apt-get update
sudo apt-get install libcudnn9-dev-cuda-13
sudo apt-get install libblas-dev liblapack-dev liblapacke-dev
sudo apt-get install libnccl2 libnccl-dev

Install MLX:

awni / mlx_lm_benchmarks.md
Last active October 14, 2025 09:43
MLX LM Benchmarks

Benchmarks for mlx-lm

The command for evaluating on MMLU Pro:

mlx_lm.evaluate --model model/repo --task mmlu_pro

The command for efficiency benchmarks:

awni / tiled_matmul.py
Last active October 9, 2025 19:19
MLX Tiled Matmul
import mlx.core as mx
# Possible tile size for tensor cores
TS = 32
# Matrix dimension (M = N = K = D)
D = 2048
A = mx.random.uniform(shape=(D, D))
B = mx.random.uniform(shape=(D, D))
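The preview cuts off before the tiled kernel itself. As a rough illustration of the tiling scheme (plain NumPy here, not the gist's MLX kernel; `TS` and `D` are shrunk so the check runs quickly), each TS×TS output tile is accumulated from matching tiles of A and B:

```python
import numpy as np

# Illustrative NumPy sketch of a tiled matmul (not the gist's MLX kernel).
TS = 32   # tile size
D = 128   # matrix dimension (M = N = K = D), kept small for a fast check

A = np.random.uniform(size=(D, D))
B = np.random.uniform(size=(D, D))

C = np.zeros((D, D))
for i in range(0, D, TS):
    for j in range(0, D, TS):
        for k in range(0, D, TS):
            # Accumulate the (i, j) output tile from a row of A tiles
            # and a column of B tiles.
            C[i:i+TS, j:j+TS] += A[i:i+TS, k:k+TS] @ B[k:k+TS, j:j+TS]
```

The same loop structure is what a GPU kernel maps onto threadgroups, with each input tile staged in fast shared memory before the inner accumulation.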
import math
import time
from functools import partial
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim
import numpy as np
from mlx.utils import tree_flatten
awni / mem.py
Last active September 23, 2025 06:01
Remember with MLX LM
import argparse
import copy
import mlx.core as mx
from pathlib import Path
from mlx_lm import load, stream_generate
from mlx_lm.generate import generate_step
from mlx_lm.models.cache import make_prompt_cache
DEFAULT_MAX_TOKENS = 2048
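The imports point at `make_prompt_cache`, which keeps the model's KV state so repeated context is not re-processed on every turn. A toy, framework-free sketch of that reuse pattern (illustrative only; this `PromptCache` is hypothetical and caches token prefixes, not KV tensors):

```python
class PromptCache:
    """Toy prefix cache: remembers tokens already processed so a follow-up
    prompt only pays for its new suffix (a stand-in for KV caching)."""

    def __init__(self):
        self.prefix = []

    def step(self, tokens):
        # Length of the longest shared prefix with the cached tokens.
        n = 0
        while n < min(len(self.prefix), len(tokens)) and self.prefix[n] == tokens[n]:
            n += 1
        new = tokens[n:]            # only these need fresh computation
        self.prefix = list(tokens)  # remember the full prompt for next turn
        return new

cache = PromptCache()
first = cache.step([1, 2, 3])         # whole prompt is new
second = cache.step([1, 2, 3, 4, 5])  # only the suffix is new
```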
awni / PROMPT.md
Last active June 25, 2025 14:58
MLX LM with Tiny Agents

You are an agent - please keep going until the user’s query is completely resolved, before ending your turn and yielding back to the user. Only terminate your turn when you are sure that the problem is solved, or if you need more info from the user to solve the problem. If you are not sure about anything pertaining to the user’s request, use your tools to read files and gather the relevant information: do NOT guess or make up an answer. You MUST plan extensively before each function call, and reflect extensively on the outcomes of the previous function calls. DO NOT do this entire process by making function calls only, as this can impair your ability to solve the problem and think insightfully.

awni / mlx_lm_openai.md
Created June 7, 2025 19:25
MLX LM + OpenAI Client

First install the dependencies:

pip install mlx-lm openai

Then start the server:

mlx_lm.server
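With the server running, any OpenAI-compatible client can talk to it; the `openai` package works by pointing `base_url` at the local server. A stdlib-only sketch of the same request (assumptions: the server listens on localhost port 8080, and `default_model` is a placeholder name):

```python
import json
from urllib import request

BASE_URL = "http://localhost:8080/v1"  # adjust if the server uses another port

def build_chat_request(messages, model="default_model", max_tokens=128):
    """Return (url, JSON body) for an OpenAI-style chat completion call."""
    body = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return BASE_URL + "/chat/completions", json.dumps(body).encode("utf-8")

def chat(messages, **kwargs):
    url, body = build_chat_request(messages, **kwargs)
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat([{"role": "user", "content": "Hello!"}]))
```

The equivalent with the installed client is `OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")` followed by `client.chat.completions.create(...)`.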
import argparse
import math
import mlx.core as mx
import mlx.nn as nn
from tqdm import tqdm
from mlx_lm.utils import load
from pathlib import Path
def eval_ppl(model, data, batch_size=32):
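The preview stops at the `eval_ppl` header. What a perplexity evaluation computes reduces to the exponential of the mean negative log-likelihood over all tokens; a hedged NumPy sketch (not the gist's implementation, which batches data through the model):

```python
import numpy as np

def eval_ppl_sketch(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood).
    `token_logprobs` holds natural-log probabilities, one per token."""
    nll = -np.mean(token_logprobs)
    return float(np.exp(nll))

# A model that gives every token probability 1/4 has perplexity 4.
ppl = eval_ppl_sketch([np.log(0.25)] * 10)
```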
class GLU: Module, UnaryLayer {
    let dim: Int

    init(dim: Int) {
        self.dim = dim
    }

    // Gated Linear Unit: split along `dim`, gate one half with a sigmoid.
    func callAsFunction(_ x: MLXArray) -> MLXArray {
        let (a, b) = x.split(axis: dim)
        return a * MLXNN.sigmoid(b)
    }
}
awni / mlx_lm_open_webui.md
Created April 25, 2025 15:41
Open WebUI with MLX LM

Setup

Install packages:

pip install open-webui mlx-lm

Start Open WebUI server: