Pratyay Banerjee (Neilblaze)

╰( ▀ ͜͞ʖ▀)つ──☆*:・゚ Hacking 👨‍💻
@Chillee
Chillee / merge_attention.py
Last active February 22, 2025 17:13
Merge Attention
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention
torch.set_default_device('cuda')
# (batch, heads, seq_len, head_dim) = (8, 8, 1024, 64)
q, k, v = [torch.randn(8, 8, 1024, 64, requires_grad=True) for _ in range(3)]
# Split full attention into a causal half and its complement.
causal_mask = create_block_mask(lambda b, h, q_idx, kv_idx: q_idx >= kv_idx, None, None, 1024, 1024)
uncausal_mask = create_block_mask(lambda b, h, q_idx, kv_idx: q_idx < kv_idx, None, None, 1024, 1024)
# Reference: one pass of full (unmasked) attention.
ref_out = flex_attention(q, k, v)
# Causal half, also returning the log-sum-exp needed to merge partial results.
causal_out, causal_lse = flex_attention(q, k, v, block_mask=causal_mask, return_lse=True)
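The preview cuts off before the merge itself; a minimal sketch of the standard log-sum-exp merge follows (variable names here are assumptions, not necessarily the gist's), recombining the two partial results so they match the single full pass:
uncausal_out, uncausal_lse = flex_attention(q, k, v, block_mask=uncausal_mask, return_lse=True)
# Each partial output is re-weighted by its share of the combined softmax
# normalizer, recovered from the two log-sum-exps.
merged_lse = torch.logaddexp(causal_lse, uncausal_lse)
merged_out = (
    causal_out * (causal_lse - merged_lse).exp().unsqueeze(-1)
    + uncausal_out * (uncausal_lse - merged_lse).exp().unsqueeze(-1)
)
torch.testing.assert_close(ref_out, merged_out, atol=2e-3, rtol=2e-3)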
Classify user search queries as either "Good Google Search Query" or "Bad Google Search Query" based on their likelihood of yielding relevant and helpful results from Google Search.
Input: User search query (text string).
Output: Classification label:
* Good Google Search Query: The query is likely to be effectively answered by Google Search.
* Bad Google Search Query: The query is unlikely to be effectively answered by Google Search.
Further categorize "Bad" queries into subtypes for better understanding and classifier training (optional but highly recommended):
* Chit-Chat/Conversational/Social
* Personal/Subjective/Opinion-Based (Un-searchable)
* Vague/Ambiguous/Lacking Specificity
@willccbb
willccbb / grpo_demo.py
Last active March 13, 2025 13:56
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
import re
import torch
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer
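The preview stops at the imports; a minimal sketch of how these pieces typically fit together with trl's GRPOTrainer follows (the dataset, reward function, model id, and hyperparameters are illustrative assumptions, not the gist's actual values):
# Toy reward: prefer completions containing an <answer> tag (illustrative only).
def reward_has_answer(prompts, completions, **kwargs):
    return [1.0 if "<answer>" in c else 0.0 for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # assumed placeholder dataset

training_args = GRPOConfig(
    output_dir="grpo-demo",
    num_generations=8,           # completions sampled per prompt for the group baseline
    max_completion_length=256,
)
trainer = GRPOTrainer(
    model="meta-llama/Llama-3.2-1B-Instruct",  # assumed 1B model id
    reward_funcs=reward_has_answer,
    args=training_args,
    train_dataset=dataset,
    peft_config=LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32),
)
trainer.train()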
Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:
* 0.8+: Continue current approach
* 0.5-0.7: Consider minor adjustments
* Below 0.5: Seriously consider backtracking and trying a different approach
@karpathy
karpathy / add_to_zshrc.sh
Created August 25, 2024 20:43
Git Commit Message AI
# -----------------------------------------------------------------------------
# AI-powered Git Commit Function
# Copy-paste this gist into your ~/.bashrc or ~/.zshrc to gain the `gcm` command. It:
# 1) gets the diff of the currently staged changes
# 2) sends it to an LLM to write the git commit message
# 3) lets you easily accept, edit, regenerate, or cancel
# But really - just read and edit the code however you like
# the `llm` CLI util is awesome, get it here: https://llm.datasette.io/en/stable/
gcm() {
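    # (Gist preview truncates here; a minimal hypothetical body follows, assuming
    # the `llm` CLI linked above - not the original's full accept/edit/regenerate
    # loop.)
    local msg
    msg=$(git diff --cached | llm "Write a concise one-line git commit message for this staged diff") || return 1
    printf 'Proposed: %s\nCommit with this message? (y/n) ' "$msg"
    read -r ans
    [ "$ans" = "y" ] && git commit -m "$msg"
}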
@sayakpaul
sayakpaul / inference.md
Last active March 10, 2025 15:31
Not so rigorously validated FP8 training of Flux (dev) DreamBooth LoRA
from diffusers import AutoPipelineForText2Image
import torch
# Load the FLUX.1-dev base pipeline in bf16, then attach the DreamBooth LoRA.
pipeline = AutoPipelineForText2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipeline.load_lora_weights("sayakpaul/yarn_art_lora_flux", weight_name="pytorch_lora_weights.safetensors")
image = pipeline("a puppy in a pond, yarn art style", guidance_scale=3.5, height=768).images[0]
image.save("yarn.png")
@sayakpaul
sayakpaul / inference_with_torchao_serialized.py
Last active January 13, 2025 01:51
Shows how to run Flux schnell in under 17GB without bells and whistles. It additionally shows how to serialize the quantized checkpoint and load it back.
import torch
from huggingface_hub import hf_hub_download
from diffusers import FluxTransformer2DModel, DiffusionPipeline
dtype, device = torch.bfloat16, "cuda"
ckpt_id = "black-forest-labs/FLUX.1-schnell"
with torch.device("meta"):
config = FluxTransformer2DModel.load_config(ckpt_id, subfolder="transformer")
model = FluxTransformer2DModel.from_config(config).to(dtype)
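The preview ends before the quantization; a minimal sketch of the serialize-and-reload idea with torchao follows (the int8 weight-only choice, repo id, and filename are assumptions for illustration):
from torchao.quantization import quantize_, int8_weight_only

# One-time, on a fully materialized model (not the meta-initialized skeleton above):
#   quantize_(full_model, int8_weight_only())
#   torch.save(full_model.state_dict(), "flux_schnell_int8.pt")

# Reload: fetch the serialized quantized weights and load them into the
# meta-initialized model; assign=True swaps out the meta tensors.
ckpt_path = hf_hub_download("some-user/flux-schnell-int8", "flux_schnell_int8.pt")  # assumed repo/file
state_dict = torch.load(ckpt_path, map_location="cpu")
model.load_state_dict(state_dict, assign=True)

pipeline = DiffusionPipeline.from_pretrained(ckpt_id, transformer=model, torch_dtype=dtype).to(device)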
@gd3kr
gd3kr / embeddings.py
Created February 15, 2024 20:35
compute embeddings for tweets in tweets.json
"""
a simple script that reads tweets inside a json file, uses openai to compute embeddings and creates two files, metadata.tsv and output.tsv, which cam be used to visualise the tweets and their embeddings in TensorFlow Projector (https://projector.tensorflow.org/)
"""
# obtain tweets.json from https://gist.github.com/gd3kr/948296cf675469f5028911f8eb276dbc
import pandas as pd
import json
from openai import OpenAI
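The preview ends at the imports; a minimal sketch of the rest of the flow follows (the embedding model, JSON shape, and single-batch request are assumptions, not necessarily the gist's):
client = OpenAI()

with open("tweets.json") as f:
    tweets = json.load(f)  # assumed: a list of objects with a "full_text" field

texts = [t["full_text"] for t in tweets]
# Assumed model; large archives would need batching under API input limits.
resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
embeddings = [d.embedding for d in resp.data]

# TensorFlow Projector expects tab-separated vectors plus a metadata file.
pd.DataFrame(embeddings).to_csv("output.tsv", sep="\t", header=False, index=False)
pd.DataFrame({"tweet": texts}).to_csv("metadata.tsv", sep="\t", index=False)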
@UdaraJay
UdaraJay / Apps.jsx
Created October 15, 2023 14:43
Animated card stack
import styles from './Apps.module.scss';
import { useEffect, useState } from 'react';
import Link from 'next/link';
const APPS = [
  {
    title: 'APP',
    hero: 'Lorem ipsum dolor sit amet',
    description:
      'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do.',