Dream is a new LLM developed by the HKU NLP Group. It applies the diffusion architecture typically used by image-generation AI to text. In other words, instead of generating text sequentially, token by token, it refines the entire output over several passes, in the style of "computer, enhance image!".
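To make the "enhance!" intuition concrete, here is a toy sketch of discrete diffusion decoding: start from a fully masked sequence and unmask a few positions per denoising step. This is not Dream's actual algorithm (a real model predicts the tokens; here a fixed target string stands in for the model's predictions), just an illustration of the iterative-unmasking idea.

```python
import random

MASK = "_"  # placeholder for a masked token position

def denoise(target, steps=4, seed=0):
    """Reveal the target sequence over `steps` denoising iterations.

    Returns the list of intermediate states, from fully masked to
    fully revealed, mimicking how a diffusion LM refines its output.
    """
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    hidden = list(range(len(target)))   # still-masked positions
    history = [list(seq)]
    for step in range(steps):
        # reveal roughly an equal share of the remaining masked slots
        k = max(1, len(hidden) // (steps - step))
        for pos in rng.sample(hidden, k):
            seq[pos] = target[pos]
            hidden.remove(pos)
        history.append(list(seq))
    return history

for state in denoise(list("hello world")):
    print("".join(state))
```

Each printed line is one denoising step, going from all masks to the final text, which is exactly the visualisation the demo linked below shows for the real model.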

The Dream v0 Instruct 7B model is small enough to run locally with llama.cpp (4.68 GB in Q4_K_M quantisation).

llama.cpp added diffusion support around July 2025.

Demo with denoising process visualisation: https://huggingface.co/spaces/multimodalart/Dream

The diffusion-specific CLI switches are implemented only in the diffusion-cli example, not in llama-server, which is designed around autoregressive causal language models with a KV cache.
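A minimal local run might look like the following. This is a sketch: the model filename is a placeholder, and the `--diffusion-steps` flag name is an assumption that may differ between llama.cpp builds, so check `llama-diffusion-cli --help` on your version first.

```shell
# Fetch the Q4_K_M GGUF from Hugging Face
huggingface-cli download keisuke-miyako/Dream-v0-Instruct-7B-gguf-q4_k_m \
  --local-dir ./dream-gguf

# Run the diffusion CLI (not llama-server).
# --diffusion-steps (assumed flag name) controls denoising iterations.
llama-diffusion-cli -m ./dream-gguf/<model-file>.gguf \
  -p "Explain diffusion language models in one paragraph." \
  --diffusion-steps 128
```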

cf. https://lab.cloud/blog/text-diffusion-support/

The GGUF quantisation is published on Hugging Face as keisuke-miyako/Dream-v0-Instruct-7B-gguf-q4_k_m (architecture: dream, ~8B params, 4-bit Q4_K_M).