Tao Hu dongzhuoyao

Notes / Links about Stable Diffusion VAE

Stable Diffusion's VAE is a neural network that encodes images into a compressed "latent" format and decodes them back. The encoder performs 48x lossy compression, and the decoder generates new detail to fill in the gaps.
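
The 48x figure can be checked with a little arithmetic. The sketch below assumes the usual Stable Diffusion 1.x layout, which is not stated in this note: a 3-channel RGB input, 8x spatial downsampling per side, and a 4-channel latent. The function name `compression_ratio` is made up for illustration.

```python
def compression_ratio(height, width, channels=3, downsample=8, latent_channels=4):
    """Ratio of pixel values to latent values for one image.

    The defaults (8x downsampling, 4 latent channels) are assumptions
    about the Stable Diffusion 1.x VAE, not taken from this note.
    """
    pixel_values = height * width * channels
    latent_values = (height // downsample) * (width // downsample) * latent_channels
    return pixel_values / latent_values


# A 512x512 RGB image has 786432 values; its 64x64x4 latent has 16384.
print(compression_ratio(512, 512))  # 48.0
```

The ratio counts values only; it says nothing about the bit width of each value.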

(Calling this model a "VAE" is sort of a misnomer - it's an encoder with some very slight KL regularization, and a conditional GAN decoder)

This document is a big pile of various links with more info.

Multi-node training on Slurm with PyTorch

What's this?

A short note on how to start multi-node training with PyTorch on the Slurm scheduler.
Especially useful when the scheduler is so busy that you cannot get multiple GPUs allocated, or when you need more than 4 GPUs for a single job.
Requirement: you have to use PyTorch DistributedDataParallel (DDP) for this purpose.
Warning: you might need to refactor your own code.
Warning: you might be secretly condemned by your colleagues for using too many GPUs.
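
As a rough illustration of the refactoring involved, each process started by Slurm needs to know its own rank and the total number of processes before DDP can be set up. The sketch below reads the per-task variables that Slurm exports (`SLURM_PROCID`, `SLURM_NTASKS`, `SLURM_LOCALID`). The helper name `slurm_ddp_env` and the single-process fallback values are assumptions for illustration, not part of the original note.

```python
import os


def slurm_ddp_env(env=None):
    """Collect rank information from Slurm's per-task environment variables.

    Hypothetical helper: falls back to a single-process layout when the
    variables are missing (for example, when run outside of Slurm).
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("SLURM_PROCID", 0)),         # global rank of this task
        "world_size": int(env.get("SLURM_NTASKS", 1)),   # total number of tasks
        "local_rank": int(env.get("SLURM_LOCALID", 0)),  # task index on this node
    }


# Example with a fake environment: task 5 of 8, second task on its node.
info = slurm_ddp_env({"SLURM_PROCID": "5", "SLURM_NTASKS": "8", "SLURM_LOCALID": "1"})
print(info)

# In a real job these values would typically be passed on, e.g.
# torch.distributed.init_process_group(backend="nccl",
#     rank=info["rank"], world_size=info["world_size"])
# with MASTER_ADDR and MASTER_PORT set for the default env:// rendezvous.
```

Passing the environment as an argument keeps the helper testable without a Slurm allocation.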

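Note: this article applies to Tmux 2.3 and above. Most features also work on lower versions, but mouse support, vi mode, and plugin management in lower versions may be incompatible with what is described here.

Tmux shortcuts & cheat sheet & concise tutorial

Start a new session:

tmux [new -s session-name -n window-name]

Restore a session: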

## VGG16 model for Keras

This is the Keras model of the 16-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, A. Zisserman

	from torch import FloatTensor, LongTensor, Tensor, Size, lerp, zeros_like
	from torch.linalg import norm

	# adapted to PyTorch from:
	# https://gist.github.com/dvschultz/3af50c40df002da3b751efab1daddf2c
	# most of the extra complexity is to support:
	# - many-dimensional vectors
	# - v0 or v1 with last dim all zeroes, or v0 ~colinear with v1
	# - falls back to lerp()
	# - conditional logic implemented with parallelism rather than Python loops
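
The comments above appear to describe a spherical linear interpolation (slerp) with a fallback to `lerp()`; that reading is an assumption, since the function body is not shown here. The following is a simplified pure-Python sketch for flat lists of floats. It uses ordinary Python loops and branches rather than the parallel tensor logic the comments mention, and the `dot_threshold` parameter and its default are hypothetical.

```python
import math


def slerp(v0, v1, t, dot_threshold=0.9995):
    """Spherically interpolate between vectors v0 and v1 at fraction t.

    Falls back to linear interpolation when either vector is all zeroes
    or when the two vectors are nearly colinear (assumed threshold).
    """
    def lerp():
        return [a + (b - a) * t for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        return lerp()  # direction is undefined for a zero vector

    # Cosine of the angle between the vectors, clamped for acos().
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > dot_threshold:
        return lerp()  # sin(theta) would be close to zero

    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    s0 = math.sin((1.0 - t) * theta) / sin_theta
    s1 = math.sin(t * theta) / sin_theta
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

A tensor version would replace the `if` branches with masked selection so that a whole batch of vectors is handled at once.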

	Iee9keaYk+mfs+S5kAp8fG11c2ljLjE2My5jb20KQEAqLm11c2ljLjEyNi5uZXQKCiFRUemfs+S5kAp8
	fHkucXEuY29tXgp8fGkueS5xcS5jb20vdjgvcGxheXNvbmcuaHRtbAp8fGMueS5xcS5jb20vdjgvZmNn
	LWJpbi9mY2dfcGxheV9zaW5nbGVfc29uZy5mY2cKQEBkbC5zdHJlYW0ucXFtdXNpYy5xcS5jb20KCiHp
	hbfni5fpn7PkuZAKfHxrdWdvdS5jb21eCnx8aXAua3Vnb3UuY29tL2NoZWNrL2lzY24KQEBmcy5vcGVu
	Lmt1Z291LmNvbQoKIemFt+aIkemfs+S5kAp8fGt1d28uY25eCnx8aXBjaGVjay5rdXdvLmNuL2lwX2No
	ZWNrLmt1d28KQEBzeWNkbi5rdXdvLmNuXgoKIeeZvuW6pumfs+S5kAp8fG11c2ljLmJhaWR1LmNvbS9k
	YXRhL3VzZXIvbG9jYXRpb24KQEB5aW55dWVzaGl0aW5nLmJhaWR1LmNvbQo=

	'''This script goes along the blog post
	"Building powerful image classification models using very little data"
	from blog.keras.io.
	It uses data that can be downloaded at:
	https://www.kaggle.com/c/dogs-vs-cats/data
	In our setup, we:
	- created a data/ folder
	- created train/ and validation/ subfolders inside data/
	- created cats/ and dogs/ subfolders inside train/ and validation/
	- put the cat pictures index 0-999 in data/train/cats

	from keras import backend as K  # needed for K.placeholder below
	from keras.models import Sequential
	from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D

	img_width, img_height = 128, 128

	# this will contain our generated images
	input_img = K.placeholder((1, 3, img_width, img_height))

	# build the VGG16 network with our input_img as input
	first_layer = ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height))