We are excited to announce Dream 7B, a state-of-the-art diffusion reasoning model that stands as the most powerful open diffusion large language model to date.
In summary, Dream 7B:
- Outperforms existing diffusion language models by a substantial margin.
- Matches or exceeds the performance of top-tier autoregressive (AR) language models of similar size across general, mathematical, and coding abilities.
- Demonstrates strong planning abilities and inference flexibility.
Surprisingly, on planning tasks such as Countdown and Sudoku, Dream 7B significantly surpasses Qwen2.5 7B and LLaMA3 8B, even without any task-specific training. In certain instances, it even outperforms the latest DeepSeek V3, which contains orders of magnitude more parameters.
Dream 7B can also synthesize outputs in arbitrary order, providing flexibility in how information is presented.
By adjusting the diffusion timesteps, the performance of Dream 7B can be flexibly tuned for either speed or quality.
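To make the speed/quality trade-off concrete, here is a minimal toy sketch of masked-diffusion-style decoding. It is not Dream 7B's implementation or API: `toy_denoiser`, `decode`, the `<mask>` token, and the confidence-based unmasking rule are all hypothetical stand-ins chosen for illustration. The only point it shows is that the number of model calls equals the number of timesteps, so fewer steps means more tokens committed per call, and that positions can be filled in any order.

```python
import math
import random

MASK = "<mask>"  # hypothetical mask token, for illustration only


def toy_denoiser(tokens, rng):
    """Hypothetical stand-in for a model forward pass.

    Returns a (token, confidence) proposal for every masked position.
    A real model would predict tokens; this one just makes up values.
    """
    return {
        i: (f"tok{i}", rng.random())
        for i, t in enumerate(tokens)
        if t == MASK
    }


def decode(length, steps, seed=0):
    """Unmask `length` positions in at most `steps` denoiser calls."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    order = []   # positions in the order they were committed
    calls = 0    # number of denoiser calls, a proxy for latency

    for step in range(steps):
        if MASK not in tokens:
            break  # everything is already filled in
        proposals = toy_denoiser(tokens, rng)
        calls += 1
        # Spread the remaining masked positions over the remaining steps.
        k = math.ceil(len(proposals) / (steps - step))
        # Commit the k most confident proposals, wherever they are.
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = proposals[i][0]
            order.append(i)
    return tokens, order, calls


if __name__ == "__main__":
    for steps in (1, 4, 8):
        tokens, order, calls = decode(length=8, steps=steps)
        print(f"steps={steps}: calls={calls}, fill order={order}")
```

In this toy, fewer steps means fewer calls but more tokens committed at once with less context from earlier decisions. That is the intuition behind trading quality for speed; how the trade-off actually plays out for Dream 7B is described in the blog post.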
For more details, please refer to our Dream 7B blog post.
We extend our heartfelt thanks to the incredible team members who contributed to this achievement: @_zhihuixie, @linzhengisme, @jiahuigao3, @WilliamZR7, Xin Jiang, Zhenguo Li, and @ikekong.