
Ollin Boer Bohan (madebyollin)

@madebyollin
madebyollin / single_pass_superconditioning.md
Last active February 18, 2025 17:01
Single-pass Superconditioning


Motivation

Guided diffusion sampling typically uses two forward passes per step:

  1. One caption-conditional forward pass, to compute E[flow | noisy image, noise level, caption]
  2. One unconditional forward pass, to compute E[flow | noisy image, noise level]

These results are then linearly combined to form a single guided/superconditioned flow prediction.
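
For reference, a minimal sketch of that linear combination (the function and scale names are mine, not from the gist; scale = 1 recovers the plain conditional prediction):

import torch

def supercondition(flow_cond, flow_uncond, scale=4.0):
    # guided flow = uncond + scale * (cond - uncond); scale > 1 extrapolates
    # past the conditional prediction, away from the unconditional one
    return flow_uncond + scale * (flow_cond - flow_uncond)

# toy usage with stand-in tensors:
flow_guided = supercondition(torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64))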

@madebyollin
madebyollin / useful_nn_concepts.md
Created January 12, 2025 05:35
Useful neural network training concepts (narrow usage, broad applicability)

These useful concepts show up in specific areas of NN-training literature but can be applied pretty broadly.

  1. Non-leaky augmentations: you can add arbitrary augmentations during training, without substantially biasing in-domain performance, by adding a secondary input that tells the network which augmentations were used. This technique shows up in the Karras et al. image generation papers (e.g. https://arxiv.org/pdf/2206.00364), but it's applicable whenever you want good performance on limited data.
  2. Batch-stratified sampling: rather than drawing per-sample random numbers with e.g. torch.rand(batch_size), you can use (torch.randperm(batch_size) + torch.rand(batch_size)) / batch_size, which has the same marginal distribution but lower variance, and therefore trains more stably (see the sketch after this list). This shows up in k-diffusion https://github.com/crowsonkb/k-diffusion/commit/a2b7b5f1ea0d3711a06661ca9e41b4e6089e5707, but it's applicable whenever you're randomizing data across the batch axis.
  3. Replay buffers: when y…
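
A runnable sketch of the batch-stratified trick from item 2 (the function name is mine; note the non-in-place add, since randperm returns an integer tensor):

import torch

def batch_stratified_rand(batch_size):
    # one uniform draw per equal-width bin of [0, 1), bins visited in random
    # order: same marginal distribution as torch.rand(batch_size), lower variance
    return (torch.randperm(batch_size) + torch.rand(batch_size)) / batch_size

print(batch_stratified_rand(8))  # e.g. per-sample noise levels for a batch of 8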
@madebyollin
madebyollin / dc_ae_review.md
Last active December 8, 2024 20:02
Reviewing the claims of DC-AE


TL;DR - I think the paper is a good contribution and basically holds up, but Figure 2 seems suspicious and the released repo doesn't include the pieces (AE training code and pretrained 4096-element AEs) that would be needed to make DC-AE practically competitive with SD/SDXL VAEs.


DC-AE is an MIT / Tsinghua / NVIDIA paper about improving generative autoencoders (like the SD VAE) at high spatial compression ratios.

@madebyollin
madebyollin / mysterious_bugs.md
Created October 18, 2024 15:11

mysterious software bugs I encounter, like, daily

  • if I add/destroy enough video tags (e.g. in the periodic output display of a long-running jupyter notebook), Safari will eventually stop rendering all video tags across all pages until I restart the browser
  • if I interleave autocast and no_grad scopes in the wrong way (I think: a no_grad block inside autocast, followed by grad-requiring code in the same autocast scope?), pytorch will silently disable grad for the entire autocast scope (a guessed repro sketch is below)
  • if I edit colab notebooks (specific ones? on a specific connection? idk), I intermittently get "this notebook has been modified" conflict-resolution dialogs despite being the only editor of the notebook
  • if I don't manually disable the collaborative extension on jupyterlab, it will intermittently teleport my cursor focus back to the first cell in the notebook

need to eventually find repro steps, then chase all of these down and fix them
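
For the autocast/no_grad item, my best guess at a repro (untested and hedged: whether grad actually gets dropped likely depends on the pytorch version and the autocast weight cache):

import torch

lin = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)
with torch.autocast("cpu", dtype=torch.bfloat16):
    with torch.no_grad():
        _ = lin(x)    # autocast may cache the bf16 weight cast made under no_grad
    y = lin(x).sum()  # second call can silently reuse that grad-less cached cast
print(y.requires_grad)  # suspicion: False, despite being outside no_grad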

@madebyollin
madebyollin / stable_notebook_diffusers_model_previewing_hacks.py
Created August 31, 2024 13:53
Hacks for stable (non-flickery) preview demo of diffusers FLUX.1 model in jupyter notebooks
from IPython.display import HTML

def get_pred_original_sample(sched, model_output, timestep, sample):
    # estimate the fully-denoised image: x0 = x_t - sigma_t * predicted_flow
    return sample - sched.sigmas[(sched.timesteps == timestep).nonzero().item()] * model_output

# TODO: fix awful globals
prev_img_str = None

def pil_to_html(pil_img, h=IM_HEIGHT, w=IM_WIDTH):  # IM_HEIGHT/IM_WIDTH defined elsewhere in the gist (preview is truncated)
    global prev_img_str
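
A quick self-contained sanity check of get_pred_original_sample (the setup below is my assumption of the flow-matching forward process, not code from the gist):

import torch
from diffusers import FlowMatchEulerDiscreteScheduler

sched = FlowMatchEulerDiscreteScheduler()
sched.set_timesteps(10)
x0, noise = torch.randn(1, 4, 8, 8), torch.randn(1, 4, 8, 8)
i = 3
sigma, t = sched.sigmas[i], sched.timesteps[i]
x_t = (1 - sigma) * x0 + sigma * noise  # flow-matching forward process
ideal_flow = noise - x0                 # the target a flow model regresses
print(torch.allclose(get_pred_original_sample(sched, ideal_flow, t, x_t), x0, atol=1e-5))  # True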

Objective

Informal (vibes-based) evaluation of the following vision-language-model captioners:

  • Florence-2-base-ft
  • CogVLM2
  • BLIP-2
  • MoonDream2
  • Share-Captioner
  • Florence-2-SD3-Captioner
@madebyollin
madebyollin / sd3_gradio_demo_with_taesd_preview.py
Created June 15, 2024 23:19
A quick hacked version of the sd3 gradio UI that has live previews (via TAESD3)
#!/usr/bin/env python3
import gradio as gr
import numpy as np
import random
import torch
from diffusers import (
    StableDiffusion3Pipeline,
    SD3Transformer2DModel,
    FlowMatchEulerDiscreteScheduler,
    AutoencoderTiny,
)
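
For context, a rough sketch of how a TAESD3 live preview is typically wired into a diffusers pipeline (callback_on_step_end is real diffusers API; the decode/display details are my guesses, not the gist's exact code):

import torch
from diffusers import AutoencoderTiny

taef3 = AutoencoderTiny.from_pretrained("madebyollin/taesd3")

def preview_callback(pipe, step, timestep, callback_kwargs):
    with torch.no_grad():
        preview = taef3.decode(callback_kwargs["latents"]).sample  # cheap approximate decode
    # ...convert `preview` to an image and display it here...
    return callback_kwargs

# pipe(prompt, callback_on_step_end=preview_callback,
#      callback_on_step_end_tensor_inputs=["latents"])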
@madebyollin
madebyollin / automatic_profiling_markers.py
Created February 27, 2024 02:57
Add human-readable profiling markers to a pytorch module
def add_profiling_markers(model):
    """Monkey-patch profiling markers into an nn.Module.

    Args:
        model: an nn.Module
    Effect:
        all model.named_modules() forward calls get wrapped in their
        own profiling scope, making traces easier to understand.
    """
@madebyollin
madebyollin / Mamba_Diffusion_IADB_Colab.ipynb
Created December 6, 2023 04:47
Mamba Diffusion (IADB)