🔥ImageFolder: Autoregressive Image Generation with Folded Tokens

lxa9867/ImageFolder

ImageFolder🚀: Autoregressive Image Generation with Folded Tokens

project page  arXiv  huggingface weights 

Updates

  • (2024.10.03) We are working on advanced training of the ImageFolder tokenizer. The code and weights will be released once this training is finished.
  • (2024.10.01) Repo created. Code and checkpoints will be released soon.

Ablation (updating)

| ID | Method | Length | rFID ↓ | gFID ↓ |
|----|--------|--------|--------|--------|
| 🔶1 | Multi-scale residual quantization (Tian et al., 2024) | 680 | 1.92 | 7.52 |
| 🔶2 | + Quantizer dropout | 680 | 1.71 | 6.03 |
| 🔶3 | + Smaller patch size K = 11 | 265 | 3.24 | 6.56 |
| 🔶4 | + Product quantization & Parallel decoding | 265 | 2.06 | 5.96 |
| 🔶5 | + Semantic regularization on all branches | 265 | 1.97 | 5.21 |
| 🔶6 | + Semantic regularization on one branch | 265 | 1.57 | 3.53 |
| 🔷7 | + Stronger discriminator | 265 | 1.18 | - |

Rows 🔶1-6 are reported in the released paper; rows from 🔷7 onward are advanced training settings.
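For readers wondering where the Length column comes from, here is a minimal sketch of how a multi-scale tokenizer's token count can be computed. It assumes each scale k contributes a k × k token map and uses the 256×256 scale schedule from VAR (Tian et al., 2024); that row 1 uses exactly this schedule is an assumption, although the total matches its length of 680. The schedule behind the length of 265 is not stated in this README, so it is not reproduced here. The names `VAR_SCALES` and `token_length` are hypothetical helpers, not part of this repository.

```python
from typing import Sequence

# Assumption: side lengths of the token maps in VAR's 256x256 schedule.
# This README does not state the schedule; it is used here for illustration.
VAR_SCALES = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)


def token_length(scales: Sequence[int]) -> int:
    """Total number of tokens when each scale k contributes a k x k token map."""
    return sum(k * k for k in scales)


print(token_length(VAR_SCALES))  # 680
```

Dropping the largest scales shrinks the total quickly because the count grows with the square of the side length, which is why a smaller final patch size shortens the sequence so much.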

Generation

Visualization of Decomposed Token

Citation

If our work assists your research, feel free to give us a star ⭐ or cite us using:

@misc{li2024imagefolderautoregressiveimagegeneration,
      title={ImageFolder: Autoregressive Image Generation with Folded Tokens}, 
      author={Xiang Li and Hao Chen and Kai Qiu and Jason Kuen and Jiuxiang Gu and Bhiksha Raj and Zhe Lin},
      year={2024},
      eprint={2410.01756},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2410.01756}, 
}
