Update README.md
README.md
CHANGED
@@ -23,36 +23,24 @@ Welcome to the official repository for the Z-Image(造相)project!
 
 ## 🎨 Z-Image
 
-**Z-Image** is the foundation model
 
 ### 🌟 Key Features
 
-
-Z-Image
-
-
-
-Z-Image emphasizes diversity across multiple generative dimensions:
-- Variations in facial identity, body pose, composition, and layout across different seeds
-- Distinct appearances for individuals in multi-person scenes
-- Higher overall variability compared to heavily speed-optimized models
-
-#### 🛠 Foundation Model for Fine-tuning & Control
-Z-Image is a non-distilled base model for downstream development:
-- Compatible with parameter-efficient fine-tuning methods
-- Extendable with structural conditioning approaches
-- Supports full Classifier-Free Guidance (CFG) for precise prompt control
-
-#### 🚫 Effective Negative Prompting
-Z-Image responds strongly to negative prompts, enabling reliable suppression of unwanted artifacts, styles, and compositional errors.
-
 
 ### 🆚 Z-Image vs Z-Image-Turbo
 
 | Aspect | Z-Image | Z-Image-Turbo |
 |------|------|------|
 | CFG | ✅ | ❌ |
-| Steps | 50 | 8 |
 | Finetunability | ✅ | ❌ |
 | Negative Prompting | ✅ | ❌ |
 | Diversity | High | Low |
@@ -77,8 +65,6 @@ HF_XET_HIGH_PERFORMANCE=1 hf download Tongyi-MAI/Z-Image
 - **Resolution:** 512×512 to 2048×2048 (total pixel area, any aspect ratio)
 - **Guidance scale:** 3.0 – 5.0
 - **Inference steps:** 28 – 50
-- **Negative prompts:** Strongly recommended for better control
-- **CFG normalization:** `False` for general stylism, `True` for realism
 
 ### Usage Example
 
 
 ## 🎨 Z-Image
 
+**Z-Image** is the foundation model of the ⚡️-Image family, engineered for good quality, robust generative diversity, and broad stylistic coverage.
+While Z-Image-Turbo is built for speed,
+Z-Image is a full-capacity, undistilled transformer designed to be the backbone for creators, researchers, and developers who require the highest level of creative freedom.
 
 ### 🌟 Key Features
 
+- **Undistilled Foundation**: As a non-distilled base model, Z-Image preserves the complete training signal. It supports full Classifier-Free Guidance (CFG), providing the precision required for complex prompt engineering and professional workflows.
+- **Aesthetic Versatility**: Z-Image masters a vast spectrum of visual languages, from hyper-realistic photography and cinematic digital art to intricate anime and stylized illustrations. It is the ideal engine for scenarios requiring rich, multi-dimensional expression.
+- **Enhanced Output Diversity**: Built for exploration, Z-Image delivers significantly higher variability in composition, facial identity, and lighting across different seeds, ensuring that multi-person scenes remain distinct and dynamic.
+- **Built for Development**: The ideal starting point for the community. Its non-distilled nature makes it a good base for LoRA training, structural conditioning (ControlNet), and semantic conditioning.
+- **Robust Negative Control**: Responds with high fidelity to negative prompting, allowing users to reliably suppress artifacts and adjust compositions.
 
 ### 🆚 Z-Image vs Z-Image-Turbo
 
 | Aspect | Z-Image | Z-Image-Turbo |
 |------|------|------|
 | CFG | ✅ | ❌ |
+| Steps | 28–50 | 8 |
 | Finetunability | ✅ | ❌ |
 | Negative Prompting | ✅ | ❌ |
 | Diversity | High | Low |
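
The comparison table lists CFG and negative prompting together, and the two are closely related. The sketch below shows the textbook classifier-free guidance blend on toy lists, to illustrate why a model that runs full CFG can also honor a negative prompt: the negative prompt simply takes the place of the unconditional branch. This is an assumption-laden illustration, not Z-Image's actual sampler; `apply_cfg` is a hypothetical helper, and details such as CFG normalization are not modeled.

```python
from typing import List, Sequence


def apply_cfg(cond: Sequence[float], uncond: Sequence[float],
              guidance_scale: float) -> List[float]:
    """Blend two predictions with the standard classifier-free guidance rule.

    `cond` is the prediction for the positive prompt; `uncond` is the
    prediction for the empty or negative prompt. A scale of 1.0 returns
    `cond` unchanged; larger scales push the result further from `uncond`.
    """
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy 3-element "predictions"; a real model would produce latent tensors.
cond = [0.5, -1.0, 2.0]
uncond = [0.0, 0.0, 1.0]
guided = apply_cfg(cond, uncond, guidance_scale=4.0)
```

Because each denoising step needs both predictions, guided sampling costs roughly two model evaluations per step.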

 - **Resolution:** 512×512 to 2048×2048 (total pixel area, any aspect ratio)
 - **Guidance scale:** 3.0 – 5.0
 - **Inference steps:** 28 – 50
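
As a rough aid, the recommended ranges above can be turned into a small pre-flight check. The sketch below assumes that "total pixel area" means the product of width and height should fall between 512×512 and 2048×2048 regardless of aspect ratio; `check_settings` is a hypothetical helper, not part of any Z-Image API, and it does not cover constraints the README does not state (for example, divisibility of the dimensions).

```python
from typing import List

MIN_PIXELS = 512 * 512      # 262,144 pixels
MAX_PIXELS = 2048 * 2048    # 4,194,304 pixels


def check_settings(width: int, height: int,
                   guidance_scale: float, steps: int) -> List[str]:
    """Return one warning per value that falls outside the recommended range."""
    warnings = []
    pixels = width * height
    if not MIN_PIXELS <= pixels <= MAX_PIXELS:
        warnings.append(
            f"pixel area {pixels} is outside {MIN_PIXELS}..{MAX_PIXELS}")
    if not 3.0 <= guidance_scale <= 5.0:
        warnings.append(
            f"guidance scale {guidance_scale} is outside 3.0..5.0")
    if not 28 <= steps <= 50:
        warnings.append(f"step count {steps} is outside 28..50")
    return warnings


# A wide 2048x512 image has the same area as 1024x1024, so it passes.
ok = check_settings(2048, 512, guidance_scale=4.0, steps=28)
bad = check_settings(4096, 4096, guidance_scale=7.5, steps=8)
```

Returning warnings instead of raising keeps the helper usable for experimentation, since values outside the recommended ranges may still run.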
 
 ### Usage Example
 