Announcement of Dream 7B: The Most Powerful Open Diffusion Large Language Model

We are excited to announce Dream 7B, a state-of-the-art diffusion reasoning model that stands as the most powerful open diffusion large language model to date.

Key Features of Dream 7B

In summary, Dream 7B:

Outperforms existing diffusion language models by a substantial margin.
Matches or exceeds the performance of top-tier autoregressive (AR) language models of similar size across general, mathematical, and coding abilities.
Demonstrates strong planning abilities and inference flexibility.

Performance on Planning Tasks

Surprisingly, on planning tasks such as Countdown and Sudoku, Dream 7B significantly surpasses Qwen2.5 7B and LLaMA3 8B without any task-specific training. In some cases it also outperforms the latest DeepSeek V3, which has orders of magnitude more parameters.

Output Synthesis Capabilities

Dream 7B can also synthesize its output in arbitrary order rather than strictly left to right, which supports both plain completion and infilling toward a fixed ending:

Completion Example: [example not included]

Infilling Example with an Exact Ending Sentence: [example not included]
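
To make the infilling setup concrete, here is a minimal toy sketch in Python. It is not Dream 7B's sampler or API: `MASK`, `build_infill_template`, `toy_denoise`, and the `propose` callback are all hypothetical names introduced for illustration, and the random fill order is only an assumption standing in for whatever order a real model would choose. It shows the general idea that the prefix and the exact ending stay fixed while the masked slots between them are filled in a non-sequential order.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, for illustration only


def build_infill_template(prefix, suffix, gap_len):
    """Return prefix + gap_len masked slots + suffix (the fixed ending)."""
    return list(prefix) + [MASK] * gap_len + list(suffix)


def toy_denoise(tokens, propose, seed=0):
    """Fill masked slots one at a time in a shuffled, non left-to-right order.

    `propose(tokens, i)` is a stand-in for a model's prediction at position i;
    it sees the current partially filled sequence, including the fixed ending.
    """
    rng = random.Random(seed)
    tokens = list(tokens)
    order = [i for i, tok in enumerate(tokens) if tok == MASK]
    rng.shuffle(order)
    for i in order:
        tokens[i] = propose(tokens, i)
    return tokens, order


template = build_infill_template(["The", "cat"], ["the", "end", "."], gap_len=3)
filled, order = toy_denoise(template, propose=lambda toks, i: f"w{i}")
print(filled)
print("fill order:", order)
```

Only the masked positions are ever written, so the prefix and the ending sentence are preserved exactly, whatever order the slots are filled in.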

Performance Tuning

By adjusting the diffusion timesteps, the performance of Dream 7B can be flexibly tuned for either speed or quality.
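
The speed versus quality trade-off can be illustrated with a small sketch. It assumes a masked-diffusion style decoder that reveals a roughly equal number of tokens at each step; that uniform schedule and the helper name `unmask_schedule` are assumptions made for illustration, not a description of Dream 7B's actual sampler. Under that assumption, fewer steps mean more tokens committed per model call (faster, coarser), and more steps mean fewer tokens per call (slower, more chances to condition on earlier choices).

```python
def unmask_schedule(num_masked, steps):
    """Split num_masked tokens as evenly as possible across `steps` denoising steps.

    Returns a list with one entry per step: how many tokens are revealed at that step.
    """
    if steps <= 0:
        raise ValueError("steps must be positive")
    base, extra = divmod(num_masked, steps)
    # The first `extra` steps each reveal one additional token.
    return [base + (1 if s < extra else 0) for s in range(steps)]


# Same 10 masked tokens, three different step budgets.
for steps in (2, 4, 10):
    print(steps, unmask_schedule(10, steps))
```

Each step corresponds to one forward pass in this toy picture, so the step count is a direct proxy for compute per generated sequence.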

Further Information

For more details, please refer to our blog post: Dream 7B Blog Post

Acknowledgments

We extend our heartfelt thanks to the incredible team members who contributed to this achievement: @_zhihuixie, @linzhengisme, @jiahuigao3, @WilliamZR7, Xin Jiang, Zhenguo Li, and @ikekong.
