QJerry committed on
Commit 108d6b8 · verified · 1 Parent(s): 8be323d

Update README.md

Files changed (1)
  1. README.md +47 -33
README.md CHANGED
@@ -6,7 +6,6 @@ pipeline_tag: text-to-image
  library_name: diffusers
  ---

-
  <h1 align="center">⚡️- Image<br><sub><sup>An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer</sup></sub></h1>

  <div align="center">
@@ -18,50 +17,71 @@ library_name: diffusers
  [![ModelScope Space](https://img.shields.io/badge/🤖%20Online_Demo-Z--Image-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=569345&modelType=Checkpoint&sdVersion=Z_IMAGE&modelUrl=modelscope%3A%2F%2FTongyi-MAI%2FZ-Image%3Frevision%3Dmaster)&#160;
  <a href="https://arxiv.org/abs/2511.22699" target="_blank"><img src="https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv" height="21px"></a>

- <!-- [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Online_Demo-Z--Image-blue)](https://huggingface.co/spaces/Tongyi-MAI/Z-Image)&#160; -->
- <!-- [![Art Gallery PDF](https://img.shields.io/badge/%F0%9F%96%BC%20Art_Gallery-PDF-ff69b4)](assets/Z-Image-Gallery.pdf)&#160;
- [![Web Art Gallery](https://img.shields.io/badge/%F0%9F%8C%90%20Web_Art_Gallery-online-00bfff)](https://modelscope.cn/studios/Tongyi-MAI/Z-Image-Gallery/summary)&#160; -->
-
-
- Welcome to the official repository for the Z-Image（造相）project!

  </div>

- ## ✨ Z-Image

- We are excited to introduce **Z-Image**, a powerful and efficient image generation model with **6B** parameters. While **Z-Image-Turbo** is designed for speed, the standard **Z-Image** stands out as our primary community foundation model, delivering higher flexibility in generation and style, excellent generative quality and aesthetics, and exceptional support for robust secondary development.

- <!-- 📸 **Photorealistic Quality**: **Z-Image-Turbo** delivers strong photorealistic image generation while maintaining excellent aesthetic quality.

- ![Showcase of Z-Image on Photo-realistic image Generation](assets/showcase_realistic.png) -->

- ### 🌟 Key Features

- #### 🎨 Aesthetic & Artistic Diversity
- Z-Image maintains high photorealism while supporting a wider range of artistic styles. Unlike the Turbo version, which is heavily optimized for realism via RL, Z-Image preserves more stylistic variety, making it better suited for anime, digital art, and other creative genres.

- #### 🛠 Fine-tuning & Community Development
- Z-Image is a non-distilled base model, making it a more flexible starting point for fine-tuning (LoRA, ControlNet, etc.).
- * **CFG Support:** Unlike distilled models that often bypass Classifier-Free Guidance, Z-Image retains full CFG support for precise prompt control.
- * **Training Stability:** The model's internal diversity and weight distribution make it more receptive to learning new concepts during downstream training compared to low-step variants.

- #### 🧬 Improved Generative Diversity
- We have focused on solving the homogenization issues common in many modern generators:
- * **Distinct Identities:** Different seeds produce noticeably different faces and compositions, avoiding the "same face" problem across generations.
- * **Multi-subject Scenes:** In prompts with multiple people, Z-Image generates individuals with unique features instead of the "cloning effect" often seen in high-speed models.

- #### 🚫 Effective Negative Prompting
- Z-Image is highly responsive to Negative Prompts. This allows for better steerability and more control over the final output, effectively filtering out unwanted elements or artifacts.

- ### 🚀 Quick Start
- Install the latest version of diffusers, use the following command:
  ```bash
  pip install git+https://github.com/huggingface/diffusers
  ```

  ```python
  import torch
  from diffusers import ZImagePipeline
@@ -84,7 +104,7 @@ image = pipe(
      negative_prompt=negative_prompt,
      height=1280,
      width=720,
-     cfg_normalization=True, # could switch if needed
      num_inference_steps=50, # May use 28-50 for Z-Image Model
      guidance_scale=4.0, # Suggested guidance scale is 3.0 to 5.0 for Z-Image Model
      generator=torch.Generator("cuda").manual_seed(42),
@@ -93,12 +113,6 @@ image = pipe(
  image.save("example.png")
  ```

- ## ⏬ Download
- ```bash
- pip install -U huggingface_hub
- HF_XET_HIGH_PERFORMANCE=1 hf download Tongyi-MAI/Z-Image
- ```
-
  ## 📜 Citation

  If you find our work useful in your research, please consider citing:
 
  library_name: diffusers
  ---

  <h1 align="center">⚡️- Image<br><sub><sup>An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer</sup></sub></h1>

  <div align="center">
 
  [![ModelScope Space](https://img.shields.io/badge/🤖%20Online_Demo-Z--Image-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=569345&modelType=Checkpoint&sdVersion=Z_IMAGE&modelUrl=modelscope%3A%2F%2FTongyi-MAI%2FZ-Image%3Frevision%3Dmaster)&#160;
  <a href="https://arxiv.org/abs/2511.22699" target="_blank"><img src="https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv" height="21px"></a>

+ Welcome to the official repository for the ⚡️- Image family!

  </div>

+ ## 🎨 Z-Image

+ **Z-Image** is the foundation model behind Z-Image-Turbo, designed for high-quality image generation with strong controllability, broad stylistic coverage, and support for downstream development.
+ It serves as the primary community model in the ⚡️- Image family, while Z-Image-Turbo focuses on high-speed inference.

+ ### 🌟 Key Features

+ #### 🎨 Aesthetic & Artistic Diversity
+ Z-Image supports a wide range of aesthetics and artistic styles, including realistic photography, anime, illustration, digital art, and stylized visuals.
+ It is suitable for creative scenarios that require rich stylistic expression rather than a single preferred aesthetic.

+ #### 🧬 Generative Diversity
+ Z-Image emphasizes diversity across multiple generative dimensions:
+ - Variations in facial identity, body pose, composition, and layout across different seeds
+ - Distinct appearances for individuals in multi-person scenes
+ - Higher overall variability compared to heavily speed-optimized models

+ #### 🛠 Foundation Model for Fine-tuning & Control
+ Z-Image is a non-distilled base model for downstream development:
+ - Compatible with parameter-efficient fine-tuning methods
+ - Extendable with structural conditioning approaches
+ - Supports full Classifier-Free Guidance (CFG) for precise prompt control

+ #### 🚫 Effective Negative Prompting
+ Z-Image responds strongly to negative prompts, enabling reliable suppression of unwanted artifacts, styles, and compositional errors.
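
For readers of this diff, the CFG and negative-prompt lines above can be illustrated with the textbook form of classifier-free guidance: the model is evaluated once on the prompt and once on a second conditioning (empty, or the negative prompt when one is given), and the two predictions are extrapolated by the guidance scale. The sketch below applies that formula to plain Python lists. It is not part of this commit and is not taken from the `ZImagePipeline` source; `cfg_combine` is a hypothetical helper, the numbers are made up, and the pipeline's `cfg_normalization` option is not modelled.

```python
from typing import List, Sequence


def cfg_combine(cond: Sequence[float], neg: Sequence[float],
                guidance_scale: float) -> List[float]:
    """Textbook CFG, element-wise: neg + scale * (cond - neg).

    Hypothetical helper for illustration; `cond` and `neg` stand in for the
    model's predictions under the prompt and the negative prompt.
    """
    if len(cond) != len(neg):
        raise ValueError("predictions must have the same length")
    return [n + guidance_scale * (c - n) for c, n in zip(cond, neg)]


# Toy values standing in for model outputs (assumed, not real predictions).
cond_pred = [1.0, 2.0, 3.0]
neg_pred = [0.5, 2.0, 4.0]

guided = cfg_combine(cond_pred, neg_pred, guidance_scale=4.0)
print(guided)
```

With `guidance_scale=1.0` the result equals the prompt-conditioned prediction, so in this formulation a negative prompt only has an effect when guidance is actually applied.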

+ ### 🆚 Z-Image vs Z-Image-Turbo

+ | Aspect | Z-Image | Z-Image-Turbo |
+ |------|------|------|
+ | CFG support | Yes | No |
+ | Fine-tuning | Yes | Limited |
+ | Aesthetic diversity | High | Reduced |
+ | Negative prompt control | Strong | None |
+ | Inference speed | Slower | Faster |

+ ## 🚀 Quick Start

+ ### Installation & Download

+ Install the latest version of diffusers:
  ```bash
  pip install git+https://github.com/huggingface/diffusers
  ```

+ Download the model:
+ ```bash
+ pip install -U huggingface_hub
+ HF_XET_HIGH_PERFORMANCE=1 hf download Tongyi-MAI/Z-Image
+ ```
+
+ ### Recommended Parameters
+
+ - **Resolution:** 512×512 to 2048×2048 (total pixel area, any aspect ratio)
+ - **Guidance scale:** 3.0 – 5.0
+ - **Inference steps:** 28 – 50
+ - **Negative prompts:** Strongly recommended for better control
+
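
The ranges listed above can be checked before calling the pipeline. The helper below is a minimal sketch and is not part of this commit or of diffusers; `check_z_image_params` is a hypothetical name, and it reads the resolution bullet as a bound on total pixel area (between 512×512 and 2048×2048 pixels), which is one interpretation of that line.

```python
# Bounds taken from the "Recommended Parameters" list; the pixel-area
# reading of the resolution bullet is an assumption.
MIN_PIXELS = 512 * 512
MAX_PIXELS = 2048 * 2048


def check_z_image_params(height, width, guidance_scale, num_inference_steps):
    """Return a list of warnings for values outside the recommended ranges."""
    warnings = []
    area = height * width
    if not (MIN_PIXELS <= area <= MAX_PIXELS):
        warnings.append(
            f"pixel area {area} is outside {MIN_PIXELS}..{MAX_PIXELS}"
        )
    if not (3.0 <= guidance_scale <= 5.0):
        warnings.append(f"guidance_scale {guidance_scale} is outside 3.0..5.0")
    if not (28 <= num_inference_steps <= 50):
        warnings.append(
            f"num_inference_steps {num_inference_steps} is outside 28..50"
        )
    return warnings


# Values used in the README's usage example.
print(check_z_image_params(height=1280, width=720,
                           guidance_scale=4.0, num_inference_steps=50))
```

An empty list means every value falls inside the recommended ranges; the helper only warns and never blocks a call, since the README presents these as recommendations.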
+ ### Usage Example
+
  ```python
  import torch
  from diffusers import ZImagePipeline
 
      negative_prompt=negative_prompt,
      height=1280,
      width=720,
+     cfg_normalization=False, # Could switch if needed: True for more realism, False for general stylism
      num_inference_steps=50, # May use 28-50 for Z-Image Model
      guidance_scale=4.0, # Suggested guidance scale is 3.0 to 5.0 for Z-Image Model
      generator=torch.Generator("cuda").manual_seed(42),
 
  image.save("example.png")
  ```

  ## 📜 Citation

  If you find our work useful in your research, please consider citing: