@ubergarm
ubergarm / DeepSeek-R1-Quantized-GGUF-Gaming-Rig-Inferencing-Fast-NVMe-SSD.md
Last active February 26, 2025 04:19
Aggregate throughput just over 2 tok/sec on R1 671B with 8 concurrent generations.

tl;dr;

You can run the real deal big boi R1 671B locally off a fast NVMe SSD even without enough RAM+VRAM to hold the 212GB dynamically quantized weights. No, it is not swap, and it won't kill your SSD's read/write cycle lifetime. No, this is not a distill model. It works fairly well despite quantization (check the unsloth blog for details on how they did that).

The basic idea is that most of the model is not loaded into RAM on startup but mmap'd. The KV cache then takes up some RAM, and most of the remaining system RAM is left available to serve as disk cache for whichever experts/weights are currently most used. I can see the model slow down while the cache dumps and refills when, for example, the model switches over to counting words.
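As a rough illustration of the mmap idea (not how the inference engine itself is implemented), here is a minimal Python sketch. It maps a file read-only and touches only a small slice, so the OS pages in just the accessed region and keeps it in the page cache. The helper names `make_demo_file` and `read_slice` and the 1 MiB demo file are assumptions made up for this example; a real GGUF file would be far larger.

```python
import mmap
import os
import tempfile


def make_demo_file(size_blocks=4096):
    """Write a hypothetical stand-in for a weights file: 256-byte pattern, repeated."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(256)) * size_blocks)  # 1 MiB by default
    return path


def read_slice(path, offset, length):
    """Map the whole file read-only, then copy out only the requested bytes.

    Mapping does not read the file; pages are faulted in from disk
    (or served from the OS page cache) when the slice is accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return bytes(mapped[offset:offset + length])


if __name__ == "__main__":
    demo_path = make_demo_file()
    try:
        print(read_slice(demo_path, 1000, 2))
    finally:
        os.remove(demo_path)
```

Because the mapping is read-only, repeated access to the same region hits the page cache rather than the SSD, and nothing is written back to disk.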

It is faster on my system using the GPU, but not by much. In theory, it may be faster overall to dedicate the GPU's PCIe lanes to more NVMe storage. Curious if anyone has a fast enough read-IOPS array to try?

Notes and example generations below.

Model Reference

@skeeto
skeeto / triangle.c
Last active April 27, 2024 06:22
Draw a triangle on Windows using OpenGL 1.1
// Draw a triangle on Windows using OpenGL 1.1
// $ gcc -mwindows -o triangle triangle.c -lopengl32
// This is free and unencumbered software released into the public domain.
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <GL/gl.h>
#define countof(a) (int)(sizeof(a) / (sizeof(*(a))))
static LRESULT CALLBACK handler(HWND h, UINT msg, WPARAM wparam, LPARAM lparam)
@thesamesam
thesamesam / xz-backdoor.md
Last active February 26, 2025 01:17
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in this document is written in good faith to be accurate, but, like I just said, we don't yet know everything about what's going on.

Update: I've disabled comments as of 2025-01-26 to avoid sending everyone notifications, a year on, whenever someone suggests a correction. Folks are still free to email corrections, of course.

Background

@skeeto
skeeto / persona.c
Last active March 25, 2024 06:51
Playing around with a little database
// $ cc -o persona persona.c
// $ ./persona <test.txt
// Ref: https://old.reddit.com/r/C_Programming/comments/1bmfb7p
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#define assert(c) while (!(c)) *(volatile int *)0 = 0
#define countof(a) (ptrdiff_t)(sizeof(a) / sizeof(*(a)))
#define new(a, t, n) (t *)alloc(a, sizeof(t), _Alignof(t), n)
@skeeto
skeeto / demo.c
Last active March 8, 2024 03:11
Font rendering demo in SDL2
// Font rendering demo
// $ cc -o demo demo.c $(sdl2-config --cflags --libs)
// Ref: https://old.reddit.com/r/C_Programming/comments/13ga82a
// This is free and unencumbered software released into the public domain.
#include "SDL.h"
// https://itch.io/jam/lowrezjam2016/topic/19413/minimal-sprite-font-with-upperlower-cases-cleanreadable
#define FONTW 72
#define FONTH 143
#define CHARW 9
@liviaerxin
liviaerxin / README.md
Last active February 22, 2025 13:49
FastAPI and Uvicorn Logging #python #fastapi #uvicorn #logging

FastAPI and Uvicorn Logging

When running a FastAPI app, all the logs in the console come from Uvicorn, and they have no timestamp or other useful information. Because Uvicorn uses the Python logging module, we can override the Uvicorn logging formatter by applying a new logging configuration.

It is also possible to unify your endpoints' logging with the Uvicorn logging by configuring all of them in the config file log_conf.yaml.
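A minimal sketch of such a configuration, written as a Python dict for `logging.config.dictConfig` rather than the gist's log_conf.yaml (the YAML file would carry the same structure). The formatter name `timestamped`, the endpoint logger name `app`, and the helper functions are assumptions for illustration. `uvicorn` is the parent of Uvicorn's `uvicorn.error` and `uvicorn.access` loggers, so configuring it alone covers both here.

```python
import io
import logging
import logging.config


def build_log_config(stream="ext://sys.stderr"):
    """Return a dictConfig schema that adds timestamps to Uvicorn and app logs."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "timestamped": {
                "format": "%(asctime)s %(levelname)s %(name)s: %(message)s",
            },
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "timestamped",
                "stream": stream,
            },
        },
        "loggers": {
            # Child loggers (uvicorn.error, uvicorn.access) propagate up to this one.
            "uvicorn": {"handlers": ["console"], "level": "INFO", "propagate": False},
            # Hypothetical logger name for your own endpoint code.
            "app": {"handlers": ["console"], "level": "INFO", "propagate": False},
        },
    }


def demo():
    """Apply the config to an in-memory stream and return what was logged."""
    buffer = io.StringIO()
    logging.config.dictConfig(build_log_config(buffer))
    logging.getLogger("uvicorn.access").info("GET /")
    logging.getLogger("app").info("handled request")
    return buffer.getvalue()


if __name__ == "__main__":
    print(demo(), end="")
```

The same dict can be passed to Uvicorn through the `log_config` argument of `uvicorn.run`, which would replace Uvicorn's default logging setup with this one.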

Before overriding:

uvicorn main:app --reload
@yaauie
yaauie / logstash-to-logstash-over-http.md
Created September 6, 2022 15:30
2022 high-level docs for logstash-to-logstash using the HTTP input/output pair

We have had some success using LS-to-LS over HTTP(S), which supports an HTTP(s) Load Balancer or Proxy in the middle, and can be secured with TLS/SSL. It can be made to be quite performant, but doing so requires some specific tuning.

Upstream (HTTP Output)

The upstream pipeline would contain a single HTTP output plugin aimed either directly at a downstream Logstash or at a Load Balancer, importantly configured with:

  • format => json_batch (for performance; without this one event will be sent at a time) and
  • retry_non_idempotent => true (for resilience; without this, some failures cannot be safely retried).

Depending on whether we are sending directly to another Logstash or through an SSL-terminating Load Balancer or proxy, the output may need to be configured

  • with HTTP Basic credentials (user/password),
@raysan5
raysan5 / raylib_vs_sdl.md
Last active February 22, 2025 05:09
raylib vs SDL - A libraries comparison

raylib_vs_sdl

Over the past few years I've been asked multiple times how raylib compares with SDL. Unfortunately, my experience with SDL was quite limited, so I couldn't provide a good comparison. In the last two years I've learned about SDL and used it to teach at University, so I feel that I can now provide a good comparison between the two.

I hope it helps future users better understand the internals and functionality of these two libraries.

Table of Contents

@daqi
daqi / rebuild-uos.js
Created July 17, 2022 03:39
UOS or deepin packaging workflow
// UOS or deepin packaging workflow
// Reference: https://www.vvave.net/archives/how-to-build-a-debian-series-distros-installation-package.html
// sudo apt-get install dh-make
// sudo apt-get install build-essential
const fs = require('fs-extra');
const path = require('path');
const { spawn } = require('child_process');
const globby = require('globby');
@niklaskeerl
niklaskeerl / notability_local_webdav_backup.md
Created May 24, 2021 08:50
Notability local webdav backup

Backup your Notability notes on your machine using webdav

Setup

  1. Prepare a folder where you want your backup to be.

  2. Install rclone for your system

  3. Run the webdav server using rclone