# Model Card for Klein-9B KV (MXFP8 / NVFP4)
Quantized, KV-cache-optimized variant of FLUX.2 Klein 9B for faster, more memory-efficient image generation.
## Model Details
### Model Description
Klein-9B KV (MXFP8 / NVFP4) is a quantized version of FLUX.2 Klein 9B with KV-cache support, improving performance in multi-reference and iterative workflows.
- Developed by: Black Forest Labs (base model), Winnougan (quantization)
- Model type: Text-to-image / image-to-image generative model
- License: Apache-2.0 (this repository); the base model's license also applies
- Quantized from model: FLUX.2-klein-9b-kv-fp8
## Uses
### Direct Use
- Text-to-image generation
- Image-to-image editing
- Multi-reference workflows that benefit from the KV-cache
### Out-of-Scope Use
- Generating factual or reliable information
- Safety-critical applications
## Bias, Risks, and Limitations
- May reflect biases present in the base model's training data
- Minor quality loss relative to the base model due to quantization
- KV-cache benefits require a compatible workflow
### Recommendations
Use MXFP8 when output quality matters most, and NVFP4 for maximum speed and lowest VRAM use. The KV-cache is most effective in iterative workflows that reuse the same reference images.
## How to Get Started with the Model
Load the model in a KV-cache-compatible pipeline (e.g., ComfyUI) and reuse the same reference images across generations so their cached keys/values are not recomputed.
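The idea behind the KV-cache can be sketched in a few lines: attention keys and values for fixed reference tokens are projected once, stored, and reused on every subsequent step instead of being recomputed. This is a toy, NumPy-only illustration — the class and function names are hypothetical and not the actual FLUX.2 or ComfyUI API.

```python
import numpy as np

def attention(q, k, v):
    """Plain scaled dot-product attention (single head, no batching)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

class ReferenceKVCache:
    """Toy cache: project reference tokens to K/V once, reuse every step."""
    def __init__(self, wk, wv):
        self.wk, self.wv = wk, wv
        self._cache = {}

    def kv(self, ref_id, ref_tokens):
        if ref_id not in self._cache:  # projection happens only on first use
            self._cache[ref_id] = (ref_tokens @ self.wk, ref_tokens @ self.wv)
        return self._cache[ref_id]

rng = np.random.default_rng(0)
d = 8
cache = ReferenceKVCache(rng.normal(size=(d, d)), rng.normal(size=(d, d)))
ref = rng.normal(size=(16, d))      # tokens from a fixed reference image

for step in range(4):               # iterative / multi-reference loop
    q = rng.normal(size=(4, d))     # new query tokens each step
    k, v = cache.kv("ref0", ref)    # K/V fetched from cache, not recomputed
    out = attention(q, k, v)
print(out.shape)                    # (4, 8)
```

The saving comes from the loop body: the reference projection cost is paid once regardless of how many generation steps reuse it.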
## Training Details
### Training Data
Inherited from FLUX.2 Klein 9B; no additional training was performed.
### Training Procedure
Post-training quantization only:
- MXFP8 (8-bit microscaling floating point)
- NVFP4 (4-bit NVIDIA floating point)
- KV-cache enabled
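The two formats trade precision for size via block-scaled encoding: MXFP8 stores 8-bit elements with one shared scale per 32-value block, NVFP4 stores 4-bit elements with one scale per 16-value block. The sketch below is a deliberately simplified illustration of block scaling using symmetric integer levels — real MXFP8/NVFP4 use specific FP8/FP4 element encodings and scale formats, which this toy version does not reproduce.

```python
import numpy as np

def block_quantize(x, bits, block):
    """Toy block-scaled quantize/dequantize round trip: one shared scale
    per block of values. Integer levels stand in for the real FP8/FP4
    element encodings, just to show the precision/size trade-off."""
    qmax = 2 ** (bits - 1) - 1
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return (q * scale).ravel()

rng = np.random.default_rng(1)
w = rng.normal(size=128).astype(np.float32)
err8 = np.abs(block_quantize(w, bits=8, block=32) - w).mean()  # MXFP8-like
err4 = np.abs(block_quantize(w, bits=4, block=16) - w).mean()  # NVFP4-like
print(err8, err4)  # 4-bit reconstruction error is larger
```

This is the quality/VRAM trade-off in miniature: halving the element width roughly halves weight memory but coarsens the representable levels within each block.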
## Evaluation
### Results
- Faster inference when the KV-cache is exploited
- Reduced VRAM usage
- Small quality trade-offs, larger for NVFP4 than for MXFP8
## Technical Specifications
- ~9B parameters
- Rectified flow transformer architecture
- 1024×1024 resolution
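A rough back-of-envelope for weight memory at each precision, derived from the ~9B parameter count. This is illustrative arithmetic only: actual VRAM use also includes activations, the KV-cache, text encoders, and per-block scale overhead.

```python
params = 9e9  # ~9B parameters

def weight_gb(bits_per_param):
    """Weight storage in GB at a given per-parameter bit width."""
    return params * bits_per_param / 8 / 1e9

print(f"BF16 : {weight_gb(16):.1f} GB")  # ~18 GB
print(f"MXFP8: {weight_gb(8):.1f} GB")   # ~9 GB
print(f"NVFP4: {weight_gb(4):.1f} GB")   # ~4.5 GB
```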
## Model Card Authors
Winnougan