---
pipeline_tag: text-to-image
library_name: diffusers
tags:
- Chroma
- quantization
- svdquant
- nunchaku
- fp4
- int4
base_model: tonera/Chroma1-HD-SVDQ
base_model_relation: quantized
license: apache-2.0
---

# Model Card (SVDQuant)

> **Language**: English | [中文](README_CN.md)

## Model files

- **Model repo**: `tonera/Chroma1-HD-SVDQ`
- **Base (Diffusers weights path)**: `tonera/Chroma1-HD-SVDQ` (repo root)
- **Quantized Transformer weights**: `tonera/Chroma1-HD-SVDQ/svdq-<precision>_r32-Chroma1-HD.safetensors`

## Quantization / inference tech

- **Inference engine**: Nunchaku (`https://github.com/nunchaku-ai/nunchaku`)

Nunchaku is a high-performance inference engine for **4-bit (FP4/INT4) low-bit neural networks**. It aims to significantly reduce VRAM usage and speed up inference while preserving generation quality as much as possible. It implements and productionizes post-training quantization methods such as **SVDQuant**, and uses kernel fusion and other optimizations to reduce the extra overhead introduced by the low-rank branches.
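
For intuition only, here is a toy NumPy sketch of the SVDQuant decomposition idea (this is **not** Nunchaku's implementation or kernels): the weight is split into a small high-precision low-rank branch that absorbs the hard-to-quantize components, plus a 4-bit residual with a much smaller dynamic range:

```python
import numpy as np

def svdquant_sketch(W, rank=32):
    """Toy illustration: W ~= low-rank branch (high precision) + 4-bit residual.
    Illustrative only -- real SVDQuant also migrates activation outliers first."""
    # 1) Low-rank branch: keep the top-`rank` singular components in full precision.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * S[:rank]) @ Vt[:rank]
    # 2) The residual has a smaller dynamic range, so 4-bit quantization hurts less.
    R = W - L
    scale = np.abs(R).max() / 7.0  # symmetric INT4 codes in [-8, 7]
    q = np.clip(np.round(R / scale), -8, 7).astype(np.int8)
    return L, q, scale

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
L, q, scale = svdquant_sketch(W, rank=32)  # rank=32 mirrors the `r32` in the filenames
W_hat = L + q.astype(np.float64) * scale
err = np.abs(W - W_hat).max()  # bounded by half a quantization step
```

The `r32` in the weight filenames refers to the rank of this low-rank branch.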

The Chroma1-HD quantized weights in this repository (e.g. `svdq-*_r32-*.safetensors`) are meant to be loaded through Nunchaku for efficient inference on supported GPUs.

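The `<precision>` placeholder resolves to `fp4` or `int4`. Nunchaku's `nunchaku.utils.get_precision()` picks this automatically from the detected GPU; conceptually, the rule follows the GPU generation, since native FP4 tensor cores arrive with Blackwell-class hardware. A dependency-free sketch of that rule (the helper name is my own, not Nunchaku's API):

```python
def pick_precision(compute_capability: tuple) -> str:
    """Illustrative rule only: NVFP4 needs Blackwell-class (SM 10.x / 12.x)
    tensor cores; older architectures fall back to the INT4 weights."""
    major, _minor = compute_capability
    return "fp4" if major >= 10 else "int4"

# e.g. an RTX 5090 reports (12, 0) -> "fp4"; an RTX 4090 reports (8, 9) -> "int4"
```

With `fp4`, for example, the transformer filename becomes `svdq-fp4_r32-Chroma1-HD.safetensors`.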
## Prerequisite: install Nunchaku

- **Official installation docs** (the recommended source of truth): `https://nunchaku.tech/docs/nunchaku/installation/installation.html`

### (Recommended) Install the official prebuilt wheel

- **Prerequisite**: `PyTorch >= 2.5` (treat the wheel's own requirements as the source of truth)
- **Install the Nunchaku wheel**: pick the wheel matching your environment from GitHub Releases / Hugging Face / ModelScope (note that `cp311` means CPython 3.11):
  - `https://github.com/nunchaku-ai/nunchaku/releases`

```bash
# Example (choose the correct wheel URL for your torch/CUDA/Python versions)
pip install https://github.com/nunchaku-ai/nunchaku/releases/download/vX.Y.Z/nunchaku-X.Y.Z+torch2.9-cp311-cp311-linux_x86_64.whl
```

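The wheel filenames on the Releases page follow the pattern shown above. A small helper (illustrative only; `X.Y.Z` stays a placeholder, so substitute real version numbers from the Releases page) can assemble the filename for your environment:

```python
def wheel_name(nunchaku_version: str, torch_minor: str, py_tag: str) -> str:
    """Build a wheel filename matching the pattern on the Releases page.
    'X.Y.Z'-style placeholders are stand-ins, not real versions."""
    return (
        f"nunchaku-{nunchaku_version}+torch{torch_minor}"
        f"-{py_tag}-{py_tag}-linux_x86_64.whl"
    )

# wheel_name("X.Y.Z", "2.9", "cp311")
# -> "nunchaku-X.Y.Z+torch2.9-cp311-cp311-linux_x86_64.whl"
```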
- **Tip (RTX 50-series GPUs)**: `CUDA >= 12.8` is usually recommended, and the FP4 weights are preferred for better compatibility and performance (follow the official docs).

## Usage example (Diffusers + Nunchaku Transformer)

Note: I am pushing for the official Nunchaku PR to be merged: https://github.com/nunchaku-ai/nunchaku/pull/928
Until it lands, you can try this out by copying `transformer_chroma.py` from this repository to `nunchaku/models/transformers/transformer_chroma.py`, then importing:

```python
from nunchaku.models.transformers.transformer_chroma import NunchakuChromaTransformer2dModel
```

```python
import torch
from diffusers import ChromaPipeline

from nunchaku.models.transformers.transformer_chroma import NunchakuChromaTransformer2dModel
from nunchaku.utils import get_precision

MODEL = "Chroma1-HD"
REPO_ID = f"tonera/{MODEL}-SVDQ"

if __name__ == "__main__":
    # Load the 4-bit SVDQuant transformer; get_precision() resolves to
    # "fp4" or "int4" depending on the detected GPU.
    transformer = NunchakuChromaTransformer2dModel.from_pretrained(
        f"{REPO_ID}/svdq-{get_precision()}_r32-{MODEL}.safetensors"
    )

    # The repo root holds the Diffusers pipeline weights; swap in the
    # quantized transformer.
    pipe = ChromaPipeline.from_pretrained(
        REPO_ID,
        transformer=transformer,
        torch_dtype=torch.bfloat16,
        use_safetensors=True,
    ).to("cuda")

    prompt = "Make Pikachu hold a sign that says 'Nunchaku is awesome', yarn art style, detailed, vibrant colors"
    image = pipe(prompt=prompt, guidance_scale=2.5, num_inference_steps=40).images[0]
    image.save("Chroma1.png")
```
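
To see why 4-bit weights matter for VRAM, a back-of-the-envelope calculation (the parameter count below is a hypothetical stand-in, not the measured size of Chroma1-HD):

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-only memory in GiB. Ignores activations and the
    small extra overhead of the low-rank branch kept in higher precision."""
    return n_params * bits_per_weight / 8 / 2**30

# Example with a hypothetical 9e9-parameter transformer:
bf16 = weight_gib(9e9, 16)  # 16-bit weights
int4 = weight_gib(9e9, 4)   # 4-bit weights: exactly 4x smaller
```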