Skip to content

Instantly share code, notes, and snippets.

FROM qwen3:30b-a3b-q8_0
TEMPLATE """{{- if .Messages }}
{{- if or .System .Tools }}<|im_start|>system
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
# Tools
@ubergarm
ubergarm / DeepSeek-R1-Quantized-GGUF-Gaming-Rig-Inferencing-Fast-NVMe-SSD.md
Last active April 17, 2025 16:55
Run DeepSeek R1 671B unsloth GGUF locally with ktransformers or llama.cpp on high end gaming rig!

tl;dr;

UPDATE Mon Mar 10 10:51:31 AM EDT 2025 Check out the newer ktransformers guide for how to get it running faster! About 3.5 tok/sec on this same gaming rig. Big thanks to Supreeth Koundinya with analyticsindiamag.com for the article!

You can run the real deal big boi R1 671B locally off a fast NVMe SSD even without enough RAM+VRAM to hold the 212GB dynamically quantized weights. No it is not swap and won't kill your SSD's read/write cycle lifetime. No this is not a distill model. It works fairly well despite quantization (check the unsloth blog for details on how they did that).

The basic idea is that most of the model itself is not loaded into RAM on startup, but mmap'd. Then kv cache will take up some RAM. Most of your system RAM is left available to serve as disk cache for whatever experts/weights are currently most u

@skeeto
skeeto / triangle.c
Last active April 27, 2024 06:22
Draw a triangle on Windows using OpenGL 1.1
// Draw a triangle on Windows using OpenGL 1.1
// $ gcc -mwindows -o triangle triangle.c -lopengl32
// This is free and unencumbered software released into the public domain.
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <GL/gl.h>
#define countof(a) (int)(sizeof(a) / (sizeof(*(a))))
static LRESULT CALLBACK handler(HWND h, UINT msg, WPARAM wparam, LPARAM lparam)
@thesamesam
thesamesam / xz-backdoor.md
Last active May 27, 2025 17:55
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in this document is made in good faith of being accurate, but like I just said; we don't yet know everything about what's going on.

Update: I've disabled comments as of 2025-01-26 to avoid everyone having notifications for something a year on if someone wants to suggest a correction. Folks are free to email to suggest corrections still, of course.

Background

@skeeto
skeeto / persona.c
Last active March 25, 2024 06:51
Playing around with a little database
// $ cc -o persona persona.c
// $ ./persona <test.txt
// Ref: https://old.reddit.com/r/C_Programming/comments/1bmfb7p
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#define assert(c) while (!(c)) *(volatile int *)0 = 0
#define countof(a) (ptrdiff_t)(sizeof(a) / sizeof(*(a)))
#define new(a, t, n) (t *)alloc(a, sizeof(t), _Alignof(t), n)
@Artefact2
Artefact2 / README.md
Last active May 25, 2025 11:39
GGUF quantizations overview
@alexweberk
alexweberk / mlx_finetuning_gemma.ipynb
Last active May 1, 2025 18:53
MLX Fine-tuning Google Gemma
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@skeeto
skeeto / demo.c
Last active March 8, 2024 03:11
Font rendering demo in SDL2
// Font rendering demo
// $ cc -o demo demo.c $(sdl2-config --cflags --libs)
// Ref: https://old.reddit.com/r/C_Programming/comments/13ga82a
// This is free and unencumbered software released into the public domain.
#include "SDL.h"
// https://itch.io/jam/lowrezjam2016/topic/19413/minimal-sprite-font-with-upperlower-cases-cleanreadable
#define FONTW 72
#define FONTH 143
#define CHARW 9
@liviaerxin
liviaerxin / README.md
Last active May 31, 2025 11:07
FastAPI and Uvicorn Logging #python #fastapi #uvicorn #logging

FastAPI and Uvicorn Logging

When running FastAPI app, all the logs in console are from Uvicorn and they do not have timestamp and other useful information. As Uvicorn applies python logging module, we can override Uvicorn logging formatter by applying a new logging configuration.

Meanwhile, it's able to unify the your endpoints logging with the Uvicorn logging by configuring all of them in the config file log_conf.yaml.

Before overriding:

uvicorn main:app --reload
@yaauie
yaauie / logstash-to-logstash-over-http.md
Created September 6, 2022 15:30
2022 high-level docs for logstash-to-logstash using the HTTP input/output pair

We have had some success using LS-to-LS over HTTP(S), which supports an HTTP(s) Load Balancer or Proxy in the middle, and can be secured with TLS/SSL. It can be made to be quite performant, but doing so requires some specific tuning.

Upstream (HTTP Output)

The upstream pipelie would contain a single HTTP output plugin aimed either directly at a downstream Logstash or at a Load Balancer, importantly configured with:

  • format => json_batch (for performance; without this one event will be sent at a time) and
  • retry_non_idempotent => true (for resilience; without this, some failures cannot be safely retried).

Depending on whether we ar sending directly to another Logstash or through an SSL-terminating Load Balancer or proxy, the output may need to be configured

  • with HTTP Basic credentials (user/password),