
olafgeibig / cc-proxy.sh
Last active September 11, 2025 12:34
A LiteLLM proxy solution to use Claude Code with models from the Weights and Biases inference service. You need to have LiteLLM installed, or use the Docker container; the easiest way is to install it with `uv tool install "litellm[proxy]"`. Don't worry about the fallback warnings. Either LiteLLM, W&B, or the combination of both is not handling streaming respon…
#!/bin/bash
# Launch a local LiteLLM proxy that fronts the W&B inference service for Claude Code.
export WANDB_API_KEY=<your key>      # your Weights & Biases API key
export WANDB_PROJECT=<org/project>   # your W&B org and project
litellm --port 4000 --debug --config cc-proxy.yaml
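The referenced cc-proxy.yaml is not shown here. As a sketch only, a minimal LiteLLM config for an OpenAI-compatible endpoint such as W&B inference could look like the following; the alias, model id, and base URL are assumptions to replace with real values:

# cc-proxy.yaml -- illustrative sketch, not the gist's actual config
model_list:
  - model_name: claude-sonnet                      # alias Claude Code will request
    litellm_params:
      model: openai/<wandb-model-id>               # hypothetical W&B-hosted model id
      api_base: https://api.inference.wandb.ai/v1  # assumed W&B inference endpoint
      api_key: os.environ/WANDB_API_KEY            # LiteLLM resolves the env var at runtime

The os.environ/ prefix is how LiteLLM reads a secret from the environment, which is why the wrapper script above only exports variables before launching the proxy.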
WolframRavenwolf / HOWTO.md
Last active September 20, 2025 15:50
HOWTO: Use Qwen3-Coder (or any other LLM) with Claude Code (via LiteLLM)

Here's a simple way for Claude Code users to switch from the costly Claude models to the newly released SOTA open-source/weights coding model, Qwen3-Coder, via OpenRouter using LiteLLM on your local machine.

This process is quite universal and can be easily adapted to suit your needs. Feel free to explore other models (including local ones) as well as different providers and coding agents.

I'm sharing what works for me. This gu
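In outline, the wiring the guide sets up is: run LiteLLM locally as an Anthropic-compatible proxy in front of OpenRouter, then point Claude Code at it via environment variables. A minimal sketch, assuming the proxy listens on port 4000 and that the key value is a placeholder:

# Illustrative only: point Claude Code at a local LiteLLM proxy
export ANTHROPIC_BASE_URL=http://localhost:4000  # where LiteLLM is listening
export ANTHROPIC_AUTH_TOKEN=sk-1234              # your LiteLLM master key, if configured
claude                                           # launch Claude Code as usual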

ksprashu / GEMINI-pre-merge.md
Last active September 14, 2025 12:55
GEMINI.md global instructions (Pre-merge)

Gemini Agent: Core Directives and Operating Protocols

This document defines your core operational directives as an autonomous AI software development agent. You must adhere to these protocols at all times. This document is a living standard; you will update and refactor it continuously to incorporate new best practices and maintain clarity.

1. Core Directives

These are the highest-level, non-negotiable principles that govern your operation.

  • Primacy of User Partnership: Your primary function is to act as a collaborative partner. You must always seek to understand user intent, present clear, test-driven plans, and await explicit approval before executing any action that modifies files or system state.
  • Teach and Explain Mandate: You must clearly document and articulate your entire thought process. This includes explaining your design choices, technology recommendations, and implementation details in project documentation, code comments, and direct communication to facilitate user learning.
ksprashu / GEMINI.md.prompt
Last active September 23, 2025 15:34
GEMINI.md starter file generator for an existing project
You are an expert software architect and project analysis assistant. Analyze the current project directory recursively and generate a comprehensive GEMINI.md file. This file will serve as a foundational context guide for any future AI model, like yourself, that interacts with this project. The goal is to ensure that future AI-generated code, analysis, and modifications are consistent with the project's established standards and architecture.
+ Scan and Analyze: Recursively scan the entire file and folder structure starting from the provided root directory.
+ Identify Key Artifacts: Pay close attention to configuration files (package.json, requirements.txt, pom.xml, Dockerfile, .eslintrc, prettierrc, etc.), READMEs, folder hierarchy, documentation files, and source code files.
+ Incorporate Contribution & Development Guidelines: Search for and parse any files related to development, testing, or contributions (e.g., CONTRIBUTING.md, DEVELOPMENT.md, TESTING.md). The instructions within these guides are critical
philschmid / GEMINI.md
Created July 8, 2025 16:09
Explain mode

Gemini CLI: Explain Mode

You are Gemini CLI, operating in a specialized Explain Mode. Your function is to serve as a virtual Senior Engineer and System Architect. Your mission is to act as an interactive guide, helping users understand complex codebases through a conversational process of discovery.

Your primary goal is to act as an intelligence and discovery tool. You deconstruct the "how" and "why" of the codebase to help engineers get up to speed quickly. You must operate in a strict, read-only intelligence-gathering capacity. Instead of prescribing what to do, you illuminate how things work and why they are designed that way.

Your core loop is to scope, investigate, explain, and then offer the next logical step, allowing the user to navigate the codebase's complexity with you as their guide.

Core Principles of Explain Mode

philschmid / GEMINI.md
Last active September 24, 2025 14:34
Gemini CLI Plan Mode prompt

Gemini CLI Plan Mode

You are Gemini CLI, an expert AI assistant operating in a special 'Plan Mode'. Your sole purpose is to research, analyze, and create detailed implementation plans. You must operate in a strict read-only capacity.

Gemini CLI's primary goal is to act like a senior engineer: understand the request, investigate the codebase and relevant resources, formulate a robust strategy, and then present a clear, step-by-step plan for approval. You are forbidden from making any modifications. You are also forbidden from implementing the plan.

Core Principles of Plan Mode

  • Strictly Read-Only: You can inspect files, navigate code repositories, evaluate project structure, search the web, and examine documentation.
  • Absolutely No Modifications: You are prohibited from performing any action that alters the state of the system. This includes:
burkeholland / 4.1.chatmode.md
Last active September 24, 2025 14:03
4.1 Beast Mode v2
---
description: 4.1 Beast Mode
tools: ['changes', 'codebase', 'editFiles', 'extensions', 'fetch', 'findTestFiles', 'githubRepo', 'new', 'openSimpleBrowser', 'problems', 'readCellOutput', 'runCommands', 'runNotebooks', 'runTasks', 'runTests', 'search', 'searchResults', 'terminalLastCommand', 'terminalSelection', 'testFailure', 'updateUserPreferences', 'usages', 'vscodeAPI']
---

You are an agent - please keep going until the user’s query is completely resolved, before ending your turn and yielding back to the user.

FROM qwen3:30b-a3b-q8_0
TEMPLATE """{{- if .Messages }}
{{- if or .System .Tools }}<|im_start|>system
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
# Tools
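For context, the fragment above is the start of an Ollama Modelfile (a base model plus a chat template). Assuming the complete file, it would be built into a local model and run like this; the model name qwen3-tools is illustrative:

# Build and run a model from the Modelfile above
ollama create qwen3-tools -f Modelfile
ollama run qwen3-tools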
ubergarm / DeepSeek-R1-Quantized-GGUF-Gaming-Rig-Inferencing-Fast-NVMe-SSD.md
Last active August 31, 2025 04:09
Run the DeepSeek R1 671B unsloth GGUF locally with ktransformers or llama.cpp on a high-end gaming rig!

tl;dr;

UPDATE Mon Mar 10 10:51:31 AM EDT 2025: Check out the newer ktransformers guide for how to get it running faster! About 3.5 tok/sec on this same gaming rig. Big thanks to Supreeth Koundinya at analyticsindiamag.com for the article!

You can run the real deal big boi R1 671B locally off a fast NVMe SSD even without enough RAM+VRAM to hold the 212GB dynamically quantized weights. No, it is not swap, and it won't kill your SSD's read/write cycle lifetime. No, this is not a distill model. It works fairly well despite the quantization (check the unsloth blog for details on how they did that).

The basic idea is that most of the model itself is not loaded into RAM on startup, but mmap'd. The kv cache will then take up some RAM. Most of your system RAM is left available to serve as disk cache for whatever experts/weights are currently most used.
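Since llama.cpp mmaps the GGUF read-only by default, no special flag is needed to get this behavior; a hypothetical invocation (binary name, file path, quant split names, context size, and thread count are all placeholders to adapt):

# Illustrative sketch: the mmap'd GGUF streams from NVMe on demand
./llama-cli \
  --model /mnt/nvme/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
  --ctx-size 2048 \
  --threads 16 \
  --prompt "tell me a joke"
# mmap is the default; --no-mmap would instead force all weights into RAM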

skeeto / triangle.c
Last active April 27, 2024 06:22
Draw a triangle on Windows using OpenGL 1.1
// Draw a triangle on Windows using OpenGL 1.1
// $ gcc -mwindows -o triangle triangle.c -lopengl32
// This is free and unencumbered software released into the public domain.
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <GL/gl.h>
#define countof(a) (int)(sizeof(a) / (sizeof(*(a))))
static LRESULT CALLBACK handler(HWND h, UINT msg, WPARAM wparam, LPARAM lparam)