An interactive autoregressive world model with real-time camera control, prompt switching, and long-horizon memory consistency.
- [2026-06-30] Project page and technical report released.
- Inference code
- Pretrained weights
- Training code
- Training data (partial)
ChuniWorld is built around four core properties — agency, persistence, durability, and responsiveness.
Two control channels: a rendered 3D cache with lightweight AdaLN camera modulation for grounded, trajectory-aware navigation, and chunk-level prompt switching to introduce new events mid-generation.
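As a rough illustration of the AdaLN camera modulation mentioned above, the sketch below normalizes a feature vector and then applies a scale and shift predicted from a camera embedding. It is a minimal pure-Python sketch of generic adaptive layer norm, not ChuniWorld's implementation: the function names, the shape of the camera embedding, and the `(1 + scale)` parameterization are all assumptions made for illustration.

```python
import math


def layer_norm(x, eps=1e-6):
    """Normalize a feature vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(weights, bias, vec):
    """Plain matrix-vector product plus bias; weights is a list of rows."""
    return [sum(w * c for w, c in zip(row, vec)) + b for row, b in zip(weights, bias)]


def adaln_modulate(token, cam_embedding, w_scale, b_scale, w_shift, b_shift):
    """Hypothetical AdaLN step: the camera embedding predicts a per-channel scale and shift."""
    scale = linear(w_scale, b_scale, cam_embedding)
    shift = linear(w_shift, b_shift, cam_embedding)
    normed = layer_norm(token)
    # The (1 + scale) form means zero-initialized projections leave the features unchanged.
    return [n * (1.0 + s) + t for n, s, t in zip(normed, scale, shift)]


# Example with a 4-channel token and a 2-dimensional camera embedding (both made up).
token = [1.0, 2.0, 3.0, 4.0]
cam = [0.5, -0.5]
zeros_w = [[0.0, 0.0] for _ in range(4)]
zeros_b = [0.0] * 4
identity_out = adaln_modulate(token, cam, zeros_w, zeros_b, zeros_w, zeros_b)
```

With zero-initialized projections the modulation reduces to a plain layer norm, which is one common reason this kind of conditioning is considered lightweight to add to an existing backbone.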
World-consistent memory — an explicit 3D cache reprojected to the queried view for spatial recall, plus a compressed frame-history embedding for temporal continuity, so revisited places stay recognizable.
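The spatial-recall idea, reprojecting an explicit 3D cache into the queried view, can be sketched with a standard pinhole camera and a depth buffer. This is only an assumed, simplified form: the cache is modeled as a list of colored 3D points, and `project_point`, `render_cache`, and the intrinsics `fx, fy, cx, cy` are hypothetical names, not part of the released code.

```python
import math


def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Map a world-space point to pixel coordinates; returns None if behind the camera."""
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0.0:
        return None
    return fx * xc / zc + cx, fy * yc / zc + cy, zc


def render_cache(points, colors, rotation, translation, fx, fy, cx, cy, width, height):
    """Splat cached points into the queried view, keeping the nearest point per pixel."""
    image = [[None] * width for _ in range(height)]
    zbuf = [[math.inf] * width for _ in range(height)]
    for point, color in zip(points, colors):
        projected = project_point(point, rotation, translation, fx, fy, cx, cy)
        if projected is None:
            continue
        u, v, depth = projected
        col, row = int(round(u)), int(round(v))
        if 0 <= col < width and 0 <= row < height and depth < zbuf[row][col]:
            zbuf[row][col] = depth
            image[row][col] = color
    return image


# Toy cache: two points on the optical axis, one behind the camera, one to the right.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = [(0.0, 0.0, 2.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.0, 1.0)]
colors = ["far", "near", "behind", "right"]
view = render_cache(points, colors, identity, [0.0, 0.0, 0.0], 1.0, 1.0, 1.0, 1.0, 3, 3)
```

Pixels that no cached point reaches stay `None`; in a generative setting those are the regions the model would have to synthesize rather than recall.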
Long-horizon stability from training on drifted histories and an error bank that re-injects accumulated artifacts into both memory and target, preventing errors from compounding over minute-long rollouts.
Real-time interaction via few-step DMD distillation and short temporal chunks, with prompt switching at chunk boundaries to minimize both visual and semantic latency.
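Chunk-level prompt switching can be pictured as a rollout loop in which a prompt change requested mid-chunk takes effect at the next chunk boundary. The sketch below is a hypothetical illustration under that reading: `rollout_with_prompt_switching`, `generate_chunk`, and the event format are invented for this example, and the generator is a stub standing in for the distilled model.

```python
def rollout_with_prompt_switching(generate_chunk, initial_prompt, prompt_events,
                                  num_chunks, frames_per_chunk):
    """Generate chunks autoregressively, applying prompt requests at chunk boundaries.

    prompt_events maps the frame index at which a request arrived to the new prompt.
    Returns the generated frames and a log of (chunk start frame, prompt used).
    """
    prompt = initial_prompt
    pending = sorted(prompt_events.items())
    frames, log = [], []
    for chunk_idx in range(num_chunks):
        start = chunk_idx * frames_per_chunk
        # Apply every request that arrived at or before this chunk's first frame.
        while pending and pending[0][0] <= start:
            prompt = pending.pop(0)[1]
        frames.extend(generate_chunk(prompt, frames, frames_per_chunk))
        log.append((start, prompt))
    return frames, log


def stub_generate_chunk(prompt, history, n):
    """Stand-in for the model: labels each frame with its prompt and frame index."""
    return [f"{prompt}:{len(history) + i}" for i in range(n)]


# A request arriving at frame 5 takes effect at the next boundary, frame 8.
frames, log = rollout_with_prompt_switching(
    stub_generate_chunk, "forest", {5: "storm"}, num_chunks=3, frames_per_chunk=4
)
```

In this sketch a request waits at most one chunk before it is applied, which is why shorter chunks reduce the delay between a prompt change and its visible effect.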
- Core Lead: Kaipeng Zhang
- Lead: Chuanhao Li
- Core Contributors: Chuanhao Li, Kaipeng Zhang, Yifan Zhan, Yongtao Ge, Yuanyang Yin
- Contributors: Jiaming Tan, Kang He, Liaoyuan Fan, Ruicong Liu, Xiaojie Xu, Xuangeng Chu, Zhen Li, Zhengyuan Lin, Zhixiang Wang, Zian Meng, Zihui Gao
For collaboration or business inquiries, contact kaipeng.zhang@shanda.com.
If you find ChuniWorld useful for your research, please cite:
@article{chuniworld2026,
title = {ChuniWorld: Interactive Long-Horizon World Modeling Toward Generative Reality},
author = {Alaya Lab},
journal = {arXiv preprint},
year = {2026}
}

See LICENSE.
