Pratyay Banerjee (Neilblaze)

╰( ▀ ͜͞ʖ▀)つ──☆*:・゚ Hacking 👨‍💻
@Chillee
Chillee / merge_attention.py
Last active February 22, 2025 17:13
Merge Attention
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention
torch.set_default_device('cuda')
# (batch, heads, seq_len, head_dim) = (8, 8, 1024, 64)
q, k, v = [torch.randn(8, 8, 1024, 64, requires_grad=True) for _ in range(3)]
# Split full attention into a causal half and its complement.
causal_mask = create_block_mask(lambda b, h, q_idx, kv_idx: q_idx >= kv_idx, None, None, 1024, 1024)
uncausal_mask = create_block_mask(lambda b, h, q_idx, kv_idx: q_idx < kv_idx, None, None, 1024, 1024)
# Reference: one pass of full (unmasked) attention.
ref_out = flex_attention(q, k, v)
# Causal half, also returning the log-sum-exp needed to merge partial results.
causal_out, causal_lse = flex_attention(q, k, v, block_mask=causal_mask, return_lse=True)
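The preview cuts off before the merge itself; a minimal sketch of the standard log-sum-exp merge follows (variable names here are assumptions, not necessarily the gist's), recombining the two partial results so they match the single full pass:
uncausal_out, uncausal_lse = flex_attention(q, k, v, block_mask=uncausal_mask, return_lse=True)
# Each partial output is re-weighted by its share of the combined softmax
# normalizer, recovered from the two log-sum-exps.
merged_lse = torch.logaddexp(causal_lse, uncausal_lse)
merged_out = (
    causal_out * (causal_lse - merged_lse).exp().unsqueeze(-1)
    + uncausal_out * (uncausal_lse - merged_lse).exp().unsqueeze(-1)
)
torch.testing.assert_close(ref_out, merged_out, atol=2e-3, rtol=2e-3)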
Classify user search queries as either "Good Google Search Query" or "Bad Google Search Query" based on their likelihood of yielding relevant and helpful results from Google Search.
Input: User search query (text string).
Output: Classification label:
* Good Google Search Query: The query is likely to be effectively answered by Google Search.
* Bad Google Search Query: The query is unlikely to be effectively answered by Google Search.
Further categorize "Bad" queries into subtypes for better understanding and classifier training (optional but highly recommended):
* Chit-Chat/Conversational/Social
* Personal/Subjective/Opinion-Based (Un-searchable)
* Vague/Ambiguous/Lacking Specificity
@willccbb
willccbb / grpo_demo.py
Last active March 13, 2025 13:56
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
import re
import torch
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer
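The preview stops at the imports; a minimal sketch of how these pieces typically fit together with trl's GRPOTrainer follows (the dataset, reward function, model id, and hyperparameters are illustrative assumptions, not the gist's actual values):
# Toy reward: prefer completions containing an <answer> tag (illustrative only).
def reward_has_answer(prompts, completions, **kwargs):
    return [1.0 if "<answer>" in c else 0.0 for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # assumed placeholder dataset

training_args = GRPOConfig(
    output_dir="grpo-demo",
    num_generations=8,           # completions sampled per prompt for the group baseline
    max_completion_length=256,
)
trainer = GRPOTrainer(
    model="meta-llama/Llama-3.2-1B-Instruct",  # assumed 1B model id
    reward_funcs=reward_has_answer,
    args=training_args,
    train_dataset=dataset,
    peft_config=LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32),
)
trainer.train()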
Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:
* 0.8+: Continue current approach
* 0.5-0.7: Consider minor adjustments
* Below 0.5: Seriously consider backtracking and trying a different approach
@karpathy
karpathy / add_to_zshrc.sh
Created August 25, 2024 20:43
Git Commit Message AI
# -----------------------------------------------------------------------------
# AI-powered Git Commit Function
# Copy-paste this gist into your ~/.bashrc or ~/.zshrc to gain the `gcm` command. It:
# 1) gets the diff of the currently staged changes
# 2) sends it to an LLM to write the git commit message
# 3) lets you easily accept, edit, regenerate, or cancel
# But really - just read and edit the code however you like
# the `llm` CLI util is awesome, get it here: https://llm.datasette.io/en/stable/
gcm() {
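    # (Gist preview truncates here; a minimal hypothetical body follows, assuming
    # the `llm` CLI linked above - not the original's full accept/edit/regenerate
    # loop.)
    local msg
    msg=$(git diff --cached | llm "Write a concise one-line git commit message for this staged diff") || return 1
    printf 'Proposed: %s\nCommit with this message? (y/n) ' "$msg"
    read -r ans
    [ "$ans" = "y" ] && git commit -m "$msg"
}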
@sayakpaul
sayakpaul / inference.md
Last active March 10, 2025 15:31
Not so rigorously validated FP8 training of Flux (dev) DreamBooth LoRA
from diffusers import AutoPipelineForText2Image
import torch
# Load the FLUX.1-dev base pipeline in bf16, then attach the DreamBooth LoRA.
pipeline = AutoPipelineForText2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipeline.load_lora_weights("sayakpaul/yarn_art_lora_flux", weight_name="pytorch_lora_weights.safetensors")
image = pipeline("a puppy in a pond, yarn art style", guidance_scale=3.5, height=768).images[0]
image.save("yarn.png")
@sayakpaul
sayakpaul / inference_with_torchao_serialized.py
Last active January 13, 2025 01:51
Shows how to run Flux schnell in under 17GB without bells and whistles. It additionally shows how to serialize the quantized checkpoint and load it back.
import torch
from huggingface_hub import hf_hub_download
from diffusers import FluxTransformer2DModel, DiffusionPipeline
dtype, device = torch.bfloat16, "cuda"
ckpt_id = "black-forest-labs/FLUX.1-schnell"
with torch.device("meta"):
config = FluxTransformer2DModel.load_config(ckpt_id, subfolder="transformer")
model = FluxTransformer2DModel.from_config(config).to(dtype)
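The preview ends before the quantization; a minimal sketch of the serialize-and-reload idea with torchao follows (the int8 weight-only choice, repo id, and filename are assumptions for illustration):
from torchao.quantization import quantize_, int8_weight_only

# One-time, on a fully materialized model (not the meta-initialized skeleton above):
#   quantize_(full_model, int8_weight_only())
#   torch.save(full_model.state_dict(), "flux_schnell_int8.pt")

# Reload: fetch the serialized quantized weights and load them into the
# meta-initialized model; assign=True swaps out the meta tensors.
ckpt_path = hf_hub_download("some-user/flux-schnell-int8", "flux_schnell_int8.pt")  # assumed repo/file
state_dict = torch.load(ckpt_path, map_location="cpu")
model.load_state_dict(state_dict, assign=True)

pipeline = DiffusionPipeline.from_pretrained(ckpt_id, transformer=model, torch_dtype=dtype).to(device)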
@gd3kr
gd3kr / embeddings.py
Created February 15, 2024 20:35
compute embeddings for tweets in tweets.json
"""
a simple script that reads tweets inside a json file, uses openai to compute embeddings and creates two files, metadata.tsv and output.tsv, which cam be used to visualise the tweets and their embeddings in TensorFlow Projector (https://projector.tensorflow.org/)
"""
# obtain tweets.json from https://gist.github.com/gd3kr/948296cf675469f5028911f8eb276dbc
import pandas as pd
import json
from openai import OpenAI
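The preview ends at the imports; a minimal sketch of the rest of the flow follows (the embedding model, JSON shape, and single-batch request are assumptions, not necessarily the gist's):
client = OpenAI()

with open("tweets.json") as f:
    tweets = json.load(f)  # assumed: a list of objects with a "full_text" field

texts = [t["full_text"] for t in tweets]
# Assumed model; large archives would need batching under API input limits.
resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
embeddings = [d.embedding for d in resp.data]

# TensorFlow Projector expects tab-separated vectors plus a metadata file.
pd.DataFrame(embeddings).to_csv("output.tsv", sep="\t", header=False, index=False)
pd.DataFrame({"tweet": texts}).to_csv("metadata.tsv", sep="\t", index=False)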
@UdaraJay
UdaraJay / Apps.jsx
Created October 15, 2023 14:43
Animated card stack
import styles from './Apps.module.scss';
import { useEffect, useState } from 'react';
import Link from 'next/link';
const APPS = [
  {
    title: 'APP',
    hero: 'Lorem ipsum dolor sit amet',
    description:
      'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do.',