torridgristle’s gists

torridgristle / kaiser_lowpass.py

Last active February 28, 2022 14:25

Kaiser Filter Lowpass Module for PyTorch. Torchvision's gaussian blur uses the "reflect" padding mode but I'm not sure if that makes sense so I've set it for "replicate" by default.

	import torch
	import torch.nn as nn
	import torch.nn.functional as F

	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

	class KaiserLowpass(nn.Module):
	def __init__(self, width=7, beta=11, periodic=False, padding_mode='replicate'):
	super().__init__()
	self.padding_mode = padding_mode

torridgristle / residual_concatenate.py

Created February 28, 2022 14:18

Residual and Concatenate modules for PyTorch. Residual taken from https://github.com/lucidrains/denoising-diffusion-pytorch/blob/eb6e1b508e920efd840f2fa13e80d60ce19ed2c1/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py#L72 since it was very useful.

	import torch
	import torch.nn as nn

	class Residual(nn.Module):
	def __init__(self, fn):
	super().__init__()
	self.fn = fn
	def forward(self, x, args, *kwargs):
	return self.fn(x, args, *kwargs) + x

torridgristle / sobel_scharr_farid_modules.py

Created February 28, 2022 14:25

Sobel and Farid edge detection modules for PyTorch. Option for using Scharr kernel instead of Sobel is enabled by default and has better rotational symmetry.

	import torch
	import torch.nn as nn
	import torch.nn.functional as F

	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

	class Sobel(nn.Module):
	def __init__(self,structure=False,scharr=True, padding_mode='reflect'):
	super().__init__()
	self.structure = structure

torridgristle / vqgan_dec_skip_lores_attn.py

Last active July 26, 2022 20:24

VQGAN F8 Decoding with downscaled attention

	def vqgan_dec_skip_lores_attn(h, temb=None):
	# middle
	h = vqgan.decoder.mid.block_1(h, temb)
	h_half = F.upsample(h,scale_factor=0.5,mode='bicubic',align_corners=False)
	h_half = vqgan.decoder.mid.attn_1(h_half) - h_half
	h_half = F.upsample(h_half,scale_factor=2,mode='bicubic',align_corners=False)
	h = h + h_half
	h = vqgan.decoder.mid.block_2(h, temb)

	# upsampling

torridgristle / Max Smooth Unpooling.py

Created August 3, 2022 13:43

Max Pool 2d Unpooling

	# Perform max pool 2d with indicies on a tensor
	max_size = 8
	max_output, max_indices = F.max_pool2d_with_indices(input_tensor,max_size)

	# Unpool it to get a tensor of the original size with zeros in all non-max areas
	max_unpool = F.max_unpool2d(max_output,max_indices,max_size,max_size)
	# Unpool it using a tensor of ones with the same indices to get ones where the tensor was sampled
	max_mask = F.max_unpool2d(torch.ones_like(max_output),max_indices,max_size,max_size)

	# Makes a kernel that's round and the distance from the center

torridgristle / depth_map_blur.py

Created August 24, 2022 18:58

Blur an image with a depth map in PyTorch. Splits the map into ranges of values, multiplies the image by those ranges, blurs them and the split map, sums all the blurred images and blurred maps together, divide blurred image sum by blurred map sum.

	#1 is end and 0 is start in the map.
	def map_blur(img,map,s_start=0.375,s_end=8,steps=8):
	img_slices = img * 0
	map_slices = map * 0
	for s in range(steps):
	sigma = (s/(steps-1)) * (s_end-s_start) + s_start
	slice_start = (s+0)/steps
	slice_end = (s+1)/steps
	map_slice = torch.logical_and(
	torch.greater_equal(map,slice_start),

torridgristle / CLIP ViT-L14 token embedding clusters.txt

Created August 29, 2022 18:22

I used CLIP ViT-L/14 token embeddings for tokens that were ASCII-only, ended in </w> meaning it wasn't a prefix (but doesn't garuntee it's a full word), started with a letter, had 3 or more letters, and wasn't the end/start tokens. Due to memory constraints I had to break it down to a smaller number of channels so it's using the PCA of these, re…

	I used CLIP ViT-L/14 token embeddings for tokens that were ASCII-only, ended in </w> meaning it wasn't a prefix (but doesn't garuntee it's a full word), started with a letter, had 3 or more letters, and wasn't the end/start tokens.
	Due to memory constraints I had to break it down to a smaller number of channels so it's using the PCA of these, reducing 768 channels to 512. From there, kmeans using https://github.com/subhadarship/kmeans_pytorch with 128 clusters and cosine distance within pytorch's autocast wrapper in the hopes of saving from memory.
	The cluster IDs from it were then used to average the non-PCA tokens, so the PCA was only affecting the clustering but not the actual values of the clusters. The averaging was done in .half() precision for memory.
	The words listed are the 64 best matching tokens (cosine similarity) for each cluster center.

	Cluster 0 :

	marker
	markers
	signaling

torridgristle / prompt_mass_encoding_randomization.py

Last active October 1, 2022 14:21

Generate every combination of prompt parts, encode all of the prompts in batches to avoid running out of memory. Alternatively only keep the min/max channel values and min/max token norms and randomly generate prompts with randn noise. Intended for Stable Diffusion but can be used for anything with CLIP by just swapping out the model.get_learned…

	import itertools

	def prompt_combinations(prompt_parts):
	'''
	Provide a list of lists of prompt parts, like:
	[ ["A ","An "], ["anteater","feather duster"] ]
	'''
	opt_prompt = list(itertools.product(*prompt_parts, repeat=1))
	opt_prompt = [''.join(opt_prompt[b]) for b in range(len(opt_prompt))]
	return opt_prompt

torridgristle / Denoiser Slew Limiter.py

Created October 12, 2022 15:39

Stable Diffusion CFGDenoiser with slew limiting and frequency splitting for detail preservation as an option.

	import torch
	import torch.nn as nn
	import torchvision.transforms.functional as TF

	1class CFGDenoiserSlew(nn.Module):
	'''
	Clamps the maximum change each step can have.
	"limit" is the clamp bounds. 0.4-0.8 seem good, 1.6 and 3.2 have very little difference and might represent the upper bound of values.
	"blur" is the radius of a gaussian blur used to split the limited output with the original output in an attempt to preserve detail and color.
	"last_step_is_blur" if true will compare the model output to the blur-split output rather than just the limited output, can look nicer.