QJerry committed on
Commit
8be323d
·
verified ·
1 Parent(s): aa9e083

Update README.md

Files changed (1)
  1. README.md +113 -3
README.md CHANGED
@@ -1,3 +1,113 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-to-image
+ library_name: diffusers
+ ---
+
+
+ <h1 align="center">⚡️- Image<br><sub><sup>An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer</sup></sub></h1>
+
+ <div align="center">
+
+ [![Official Site](https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage)](https://tongyi-mai.github.io/Z-Image-blog/)&#160;
+ [![GitHub](https://img.shields.io/badge/GitHub-Z--Image-181717?logo=github&logoColor=white)](https://github.com/Tongyi-MAI/Z-Image)&#160;
+ [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Checkpoint-Z--Image-yellow)](https://huggingface.co/Tongyi-MAI/Z-Image)&#160;
+ [![ModelScope Model](https://img.shields.io/badge/🤖%20Checkpoint-Z--Image-624aff)](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image)&#160;
+ [![ModelScope Space](https://img.shields.io/badge/🤖%20Online_Demo-Z--Image-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=569345&modelType=Checkpoint&sdVersion=Z_IMAGE&modelUrl=modelscope%3A%2F%2FTongyi-MAI%2FZ-Image%3Frevision%3Dmaster)&#160;
+ <a href="https://arxiv.org/abs/2511.22699" target="_blank"><img src="https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv" height="21px"></a>
+
+ <!-- [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Online_Demo-Z--Image-blue)](https://huggingface.co/spaces/Tongyi-MAI/Z-Image)&#160; -->
+ <!-- [![Art Gallery PDF](https://img.shields.io/badge/%F0%9F%96%BC%20Art_Gallery-PDF-ff69b4)](assets/Z-Image-Gallery.pdf)&#160;
+ [![Web Art Gallery](https://img.shields.io/badge/%F0%9F%8C%90%20Web_Art_Gallery-online-00bfff)](https://modelscope.cn/studios/Tongyi-MAI/Z-Image-Gallery/summary)&#160; -->
+
+
+ Welcome to the official repository for the Z-Image (造相) project!
+
+ </div>
+
+
+
+ ## ✨ Z-Image
+
+ We are excited to introduce **Z-Image**, a powerful and efficient image generation model with **6B** parameters. While **Z-Image-Turbo** is designed for speed, the standard **Z-Image** is our primary community foundation model: it offers greater flexibility in generation and style, strong generative quality and aesthetics, and a solid base for downstream development.
+
+ <!-- 📸 **Photorealistic Quality**: **Z-Image-Turbo** delivers strong photorealistic image generation while maintaining excellent aesthetic quality.
+
+ ![Showcase of Z-Image on Photo-realistic image Generation](assets/showcase_realistic.png) -->
+
+ ### 🌟 Key Features
+
+ #### 🎨 Aesthetic & Artistic Diversity
+ Z-Image maintains high photorealism while supporting a wider range of artistic styles. Unlike the Turbo version, which is heavily optimized for realism via reinforcement learning (RL), Z-Image preserves more stylistic variety, making it better suited for anime, digital art, and other creative genres.
+
+ #### 🛠 Fine-tuning & Community Development
+ Z-Image is a non-distilled base model, making it a more flexible starting point for fine-tuning (LoRA, ControlNet, etc.).
+ * **CFG Support:** Unlike distilled models that often bypass Classifier-Free Guidance, Z-Image retains full CFG support for precise prompt control.
+ * **Training Stability:** The model's internal diversity and weight distribution make it more receptive to learning new concepts during downstream training than low-step variants.
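To make the CFG bullet concrete, here is a minimal sketch of textbook classifier-free guidance on toy vectors. It is not the internal code of `ZImagePipeline`; the function name `cfg_combine` and the sample numbers are illustrative assumptions, and implementations differ in how they define the scale, so values are not always comparable across libraries.

```python
def cfg_combine(cond, uncond, guidance_scale):
    """Textbook classifier-free guidance on two same-length prediction vectors."""
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    # Start from the unconditional prediction and move along (cond - uncond).
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]    # toy prediction given the prompt (made-up numbers)
uncond = [0.5, 1.0]  # toy prediction given an empty prompt (made-up numbers)

print(cfg_combine(cond, uncond, 1.0))  # [1.0, 2.0]: scale 1.0 returns the conditional prediction
print(cfg_combine(cond, uncond, 4.0))  # [2.5, 5.0]: larger scales extrapolate past it
```

In this formulation a distilled model that skips guidance behaves like a fixed scale of 1.0, whereas a base model lets the scale be tuned per prompt.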
+
+ #### 🧬 Improved Generative Diversity
+ We have focused on reducing the homogenization common in many modern generators:
+ * **Distinct Identities:** Different seeds produce noticeably different faces and compositions, avoiding the "same face" problem across generations.
+ * **Multi-subject Scenes:** In prompts with multiple people, Z-Image generates individuals with unique features instead of the "cloning effect" often seen in high-speed models.
+
+ #### 🚫 Effective Negative Prompting
+ Z-Image is highly responsive to negative prompts, which improves steerability and gives more control over the final output by filtering out unwanted elements or artifacts.
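A common way for pipelines to implement negative prompts is to use the negative-prompt prediction as the baseline branch of classifier-free guidance. Assuming Z-Image follows this general pattern (this README does not spell out the exact formula, and the symbols below are notation introduced here), the guided prediction can be written as:

```latex
\hat{v} = v_\theta(x_t, c_{\text{neg}})
        + s \,\bigl( v_\theta(x_t, c_{\text{pos}}) - v_\theta(x_t, c_{\text{neg}}) \bigr)
```

Here $x_t$ is the noisy latent, $c_{\text{pos}}$ and $c_{\text{neg}}$ are the embeddings of the prompt and the negative prompt, and $s$ is the guidance scale. With an empty negative prompt this reduces to ordinary guidance against an empty-prompt baseline; for $s > 1$ the result is pushed away from whatever the negative prompt describes.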
+
+
+ ### 🚀 Quick Start
+ Install the latest version of diffusers with the following command:
+ ```bash
+ pip install git+https://github.com/huggingface/diffusers
+ ```
+
+ ```python
+ import torch
+ from diffusers import ZImagePipeline
+
+ # 1. Load the pipeline
+ # Use bfloat16 for optimal performance on supported GPUs
+ pipe = ZImagePipeline.from_pretrained(
+     "Tongyi-MAI/Z-Image",
+     torch_dtype=torch.bfloat16,
+     low_cpu_mem_usage=False,
+ )
+ pipe.to("cuda")
+
+ # 2. Generate an image
+ # The prompt is in Chinese; it describes two young Asian women posing in front of a plain gray textured wall.
+ prompt = "两名年轻亚裔女性紧密站在一起,背景为朴素的灰色纹理墙面,可能是室内地毯地面。左侧女性留着长卷发,身穿藏青色毛衣,左袖有奶油色褶皱装饰,内搭白色立领衬衫,下身白色裤子;佩戴小巧金色耳钉,双臂交叉于背后。右侧女性留直肩长发,身穿奶油色卫衣,胸前印有“Tunthetables”字样,下方为“New ideas”,搭配白色裤子;佩戴银色小环耳环,双臂交叉于胸前。两人均面带微笑直视镜头。照片,自然光照明,柔和阴影,以藏青、奶油白为主的中性色调,休闲时尚摄影,中等景深,面部和上半身对焦清晰,姿态放松,表情友好,室内环境,地毯地面,纯色背景。"
+ negative_prompt = ""  # Optional; describe content you want the model to avoid
+
+ image = pipe(
+     prompt=prompt,
+     negative_prompt=negative_prompt,
+     height=1280,
+     width=720,
+     cfg_normalization=True,  # Toggle as needed
+     num_inference_steps=50,  # 28-50 steps are suggested for Z-Image
+     guidance_scale=4.0,  # Suggested guidance scale for Z-Image is 3.0 to 5.0
+     generator=torch.Generator("cuda").manual_seed(42),
+ ).images[0]
+
+ image.save("example.png")
+ ```
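The step and guidance ranges mentioned in the comments above can be wrapped in a small guard. This is only a sketch: `z_image_kwargs` is a hypothetical helper, not part of diffusers, and it only warns when a value leaves the suggested range rather than rejecting it.

```python
import warnings

# Ranges quoted in the Quick Start comments above.
STEP_RANGE = (28, 50)
GUIDANCE_RANGE = (3.0, 5.0)


def z_image_kwargs(num_inference_steps=50, guidance_scale=4.0, **extra):
    """Hypothetical helper: build pipeline keyword arguments, warning on out-of-range values."""
    lo, hi = STEP_RANGE
    if not lo <= num_inference_steps <= hi:
        warnings.warn(
            f"num_inference_steps={num_inference_steps} is outside the suggested {lo}-{hi} range"
        )
    lo, hi = GUIDANCE_RANGE
    if not lo <= guidance_scale <= hi:
        warnings.warn(
            f"guidance_scale={guidance_scale} is outside the suggested {lo}-{hi} range"
        )
    # Any other keyword (height, width, generator, ...) is passed through unchanged.
    return {
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        **extra,
    }
```

With the pipeline from the Quick Start loaded, a call could then look like `pipe(prompt=prompt, **z_image_kwargs(num_inference_steps=28, height=1280, width=720)).images[0]`.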
+
+ ## ⏬ Download
+ ```bash
+ pip install -U huggingface_hub
+ HF_XET_HIGH_PERFORMANCE=1 hf download Tongyi-MAI/Z-Image
+ ```
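The same download can be scripted from Python. A minimal sketch, assuming `huggingface_hub` is installed: `download_z_image` is a hypothetical helper name, and setting the environment variable before the import is an assumption about when the library reads it, made to mirror the shell command above.

```python
import os

REPO_ID = "Tongyi-MAI/Z-Image"


def download_z_image(local_dir=None):
    """Hypothetical helper: fetch the checkpoint files and return the local folder path."""
    # Mirror the shell command above; set the variable before importing huggingface_hub.
    os.environ.setdefault("HF_XET_HIGH_PERFORMANCE", "1")
    from huggingface_hub import snapshot_download

    # With local_dir=None the files go to the default Hugging Face cache.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_z_image()` returns the path of the downloaded snapshot; `from_pretrained("Tongyi-MAI/Z-Image")` can also download on first use, so this step is optional.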
+
+ ## 📜 Citation
+
+ If you find our work useful in your research, please consider citing:
+
+ ```bibtex
+ @article{team2025zimage,
+   title={Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer},
+   author={Z-Image Team},
+   journal={arXiv preprint arXiv:2511.22699},
+   year={2025}
+ }
+ ```