- (2024.10.03) We are working on advanced training of the ImageFolder tokenizer. The code and weights will be released once the advanced training is finished.
- (2024.10.01) Repo created. Code and checkpoints will be released soon.

ID | Method | Length | rFID ↓ | gFID ↓ |
---|---|---|---|---|
🔶1 | Multi-scale residual quantization (Tian et al., 2024) | 680 | 1.92 | 7.52 |
🔶2 | + Quantizer dropout | 680 | 1.71 | 6.03 |
🔶3 | + Smaller patch size K = 11 | 265 | 3.24 | 6.56 |
🔶4 | + Product quantization & Parallel decoding | 265 | 2.06 | 5.96 |
🔶5 | + Semantic regularization on all branches | 265 | 1.97 | 5.21 |
🔶6 | + Semantic regularization on one branch | 265 | 1.57 | 3.53 |
🔷7 | + Stronger discriminator | 265 | 1.18 | - |

Settings 🔶1-6 are reported in the released paper; 🔷7 onward are advanced training settings.
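
As a rough reading aid for the Length column, the sketch below assumes that Length counts the total number of tokens over all scales of a multi-scale tokenizer, with each scale contributing a square token map. The schedule `VAR_SCALES` is the default from the public VAR code (Tian et al., 2024) and is an assumption about how row 🔶1 was configured; the helper name `multiscale_token_length` is hypothetical and not part of this repo. The schedule behind the 265-token rows is not stated here, so it is not reproduced.

```python
from typing import Sequence


def multiscale_token_length(side_lengths: Sequence[int]) -> int:
    """Total token count when each scale contributes a side x side token map."""
    return sum(side * side for side in side_lengths)


# Assumed schedule: the default multi-scale side lengths in the public VAR code.
VAR_SCALES = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)

if __name__ == "__main__":
    # Under the assumptions above, this matches the 680 reported for rows 1 and 2.
    print(multiscale_token_length(VAR_SCALES))
```

Under this reading, a shorter schedule with a smaller final token map lowers the sequence length the autoregressive model has to generate.
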
If our work assists your research, feel free to give us a star ⭐ or cite us using:
@misc{li2024imagefolderautoregressiveimagegeneration,
title={ImageFolder: Autoregressive Image Generation with Folded Tokens},
author={Xiang Li and Hao Chen and Kai Qiu and Jason Kuen and Jiuxiang Gu and Bhiksha Raj and Zhe Lin},
year={2024},
eprint={2410.01756},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.01756},
}