All metrics are computed on the ImageNet 256×256 validation set (50,000 images).

| model | encoder | decoder | rFID ↓ | LPIPS ↓ | PSNR (dB) ↑ |
|---|---|---|---|---|---|
| SD 1.5 | base | base | 0.9275 | 0.0753 | 25.35 |
| SD 1.5 | TAE | base | 2.5711 | 0.1109 | 23.89 |
| SD 1.5 | base | TAE | 2.9040 | 0.0995 | 23.47 |
| SD 1.5 | TAE | TAE | 3.8339 | 0.1092 | 23.52 |
| SDXL | base | base | 0.7690 | 0.0703 | 25.60 |
| SDXL | TAE | base | 1.4093 | 0.0888 | 24.83 |