This is an English description of the .PMX file format used in Miku Miku Dance (MMD).
PMX is the successor to the .PMD format (Polygon Model Data).
This is work-in-progress! Please leave feedback in the comments.
tl;dr: use Linux, install bitsandbytes (either globally or in KAI's conda env), and add load_in_8bit=True, device_map="auto" to the model pipeline creation calls.
Many people are unable to load models due to their GPU's limited VRAM. These models contain billions of parameters (model weights and biases), each stored as a 32-bit (or 16-bit) float. Thanks to the hard work of some researchers [1], it's possible to run these models using 8-bit numbers, which halves the required amount of VRAM compared to running in half precision (16-bit). E.g. if a model requires 16GB of VRAM in half precision, running it with 8-bit inference only requires about 8GB.
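The arithmetic behind that estimate can be sketched in a few lines of Python. This is only a rough illustration: it counts the weights alone (no activations, cache, or framework overhead), and the parameter count is a hypothetical value picked so that the 16-bit case comes out at exactly 16 GiB.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1024**3

# Hypothetical parameter count (about 8.6 billion), chosen for round numbers.
NUM_PARAMS = 8 * 1024**3

for bits in (32, 16, 8):
    print(f"{bits:>2}-bit weights: {weight_memory_gib(NUM_PARAMS, bits):.0f} GiB")
```

Real usage will be somewhat higher than these figures, so treat them as a lower bound when deciding whether a model fits on your GPU.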
This guide was written for KoboldAI 1.19.1, and tested with Ubuntu 20.04. These instructions are based on work by Gmin in KoboldAI's Discord server, and Huggingface's efficient LM inference guide.
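As a rough illustration of what adding load_in_8bit=True and device_map="auto" to a model creation call looks like, here is a minimal sketch using the transformers library directly. It is not KoboldAI's actual code: the helper names and the model name in the usage comment are hypothetical, and it assumes transformers, accelerate, and bitsandbytes are installed and a CUDA GPU is available. Newer transformers releases may prefer passing these settings through a quantization config object instead.

```python
def eight_bit_kwargs():
    """Extra keyword arguments that request 8-bit loading (hypothetical helper)."""
    return {
        "load_in_8bit": True,   # quantize weights to 8-bit via bitsandbytes
        "device_map": "auto",   # let accelerate place layers on available devices
    }

def load_model_8bit(model_name):
    """Load a causal LM in 8-bit mode (hypothetical helper, not KoboldAI code)."""
    # Imported lazily so this file can be read without transformers installed.
    from transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(model_name, **eight_bit_kwargs())

# Usage (assumed model name; downloads weights and needs a CUDA GPU):
# model = load_model_8bit("some-org/some-model")
```

In an existing code base, the same two arguments would be added to whichever from_pretrained calls create the model.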