Skip to content

Instantly share code, notes, and snippets.

View abodacs's full-sized avatar

Abdullah Mohammed abodacs

View GitHub Profile
@cpfiffer
cpfiffer / thinking-cap.py
Created January 22, 2025 18:11
Limit the number of characters DeepSeek R1 can use for thinking.
import outlines
from transformers import AutoTokenizer
model_string = 'deepseek-ai/DeepSeek-R1-Distill-Qwen-7B'
# model_string = 'deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B' # For small machines
model = outlines.models.transformers(
model_string,
device='cuda', # also 'cpu', 'mps','auto'
)
@philschmid
philschmid / get_memory_size.py
Created January 16, 2025 13:53
Get needed GPU per precision for a Hugging Face Model Id
from typing import Dict, Union
from huggingface_hub import get_safetensors_metadata
import argparse
import sys
# Example:
# python get_gpu_memory.py Qwen/Qwen2.5-7B-Instruct
# Dictionary mapping dtype strings to their byte sizes
bytes_per_dtype: Dict[str, float] = {
@cpfiffer
cpfiffer / highlights.py
Created January 15, 2025 21:17
Extract highlights from a piece of content.
import outlines
from pydantic import BaseModel
from transformers import AutoTokenizer
from rich import print
model_string = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
model = outlines.models.transformers(model_string)
tokenizer = AutoTokenizer.from_pretrained(model_string)
class Highlights(BaseModel):
#!/usr/bin/env python3
"""
Human quality transcripts from audio files using
AssemblyAI for transcription and Google's Gemini for enhancement.
Requirements:
- AssemblyAI API key (https://www.assemblyai.com/)
- Google API key (https://aistudio.google.com/)
- Python packages: assemblyai, google-generativeai, pydub
@charlesfrye
charlesfrye / wrapper.py
Last active February 24, 2025 16:16
Train GPT-2 in five minutes -- for free!
# Train GPT-2 in five minutes -- for free
#
# ```bash
# pip install modal
# modal setup
# modal run wrapper.py
# ```
#
# Note that the end-to-end latency the first time is more like 25 minutes:
# - five minutes to install Torch (rip)
@virattt
virattt / hedge-fund-agent-team-v1-3.ipynb
Last active March 28, 2025 05:53
hedge-fund-agent-team-v1-3.ipynb
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@andrasbacsai
andrasbacsai / firewall.sh
Last active April 19, 2025 14:31
Update a Hetzner Firewall rule with your IP address
#!/bin/bash
# Script to update a firewall rule in a Hetzner Firewall with your current IP address.
# Good if you would like to restrict SSH access only for your current IP address (secure).
#################
# WARNING: This script will overwrite all rules in the firewall rules, so make sure you
# added all the required rules.
# I use a separate firewall rule just for SSH access.
#################
@sayakpaul
sayakpaul / inference.md
Last active February 5, 2025 14:13
(Not so rigrously tested) example showing how to use `bitsandbytes`, `peft`, etc. to LoRA fine-tune Flux.1 Dev.

When loading the LoRA params (that were obtained on a quantized base model) and merging them into the base model, it is recommended to first dequantize the base model, merge the LoRA params into it, and then quantize the model again. This is because merging into 4bit quantized models can lead to some rounding errors. Below, we provide an end-to-end example:

  1. First, load the original model and merge the LoRA params into it:
from diffusers import FluxPipeline 
import torch 

ckpt_id = "black-forest-labs/FLUX.1-dev"
pipeline = FluxPipeline.from_pretrained(
@kwindla
kwindla / talk.py
Created October 7, 2024 01:59
command-line openai realtime
import asyncio
import base64
import json
import os
import pyaudio
import shutil
import websockets
class AudioStreamer:
var $debugHelper = $debugHelper || {};
$debugHelper = function () {
var href = "lib/debugger.css";
var addCss = function () {
if (styleStyleIsLoaded(href) === true) {
return;
}
const head = document.head;
const link = document.createElement("link");
link.type = "text/css";