# Model Card for Klein-9B KV (MXFP8 / NVFP4)
Quantized, KV-cache-optimized variant of FLUX.2 Klein 9B for faster, more memory-efficient image generation.
## Model Details
### Model Description
Klein-9B KV (MXFP8 / NVFP4) is a quantized version of FLUX.2 Klein 9B with KV-cache support, improving performance in multi-reference and iterative workflows.
- Developed by: Black Forest Labs (base model), Winnougan (quantization)
- Model type: Text-to-image / image-to-image generative model
- License: Apache-2.0 (this repository); the base model's license also applies
- Quantized from model: FLUX.2-klein-9b-kv-fp8
## Uses
### Direct Use
- Text-to-image generation
- Image-to-image editing
- Multi-reference workflows that benefit from the KV-cache
### Out-of-Scope Use
- Generating factual or reliable information
- Safety-critical applications
## Bias, Risks, and Limitations
- May reflect biases present in the base model's training data
- Minor quality loss relative to the base model due to quantization
- KV-cache benefits require a compatible workflow
### Recommendations
Use MXFP8 when output quality matters most, and NVFP4 for maximum speed and lowest VRAM use. The KV-cache is most effective in iterative workflows that reuse the same reference images.
## How to Get Started with the Model
Load the model in a KV-cache-compatible pipeline (e.g., ComfyUI) and reuse the same reference images across generations so their cached keys/values are not recomputed.
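The idea behind the KV-cache can be sketched in a few lines: attention keys and values for fixed reference tokens are projected once, stored, and reused on every subsequent step instead of being recomputed. This is a toy, NumPy-only illustration — the class and function names are hypothetical and not the actual FLUX.2 or ComfyUI API.

```python
import numpy as np

def attention(q, k, v):
    """Plain scaled dot-product attention (single head, no batching)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

class ReferenceKVCache:
    """Toy cache: project reference tokens to K/V once, reuse every step."""
    def __init__(self, wk, wv):
        self.wk, self.wv = wk, wv
        self._cache = {}

    def kv(self, ref_id, ref_tokens):
        if ref_id not in self._cache:  # projection happens only on first use
            self._cache[ref_id] = (ref_tokens @ self.wk, ref_tokens @ self.wv)
        return self._cache[ref_id]

rng = np.random.default_rng(0)
d = 8
cache = ReferenceKVCache(rng.normal(size=(d, d)), rng.normal(size=(d, d)))
ref = rng.normal(size=(16, d))      # tokens from a fixed reference image

for step in range(4):               # iterative / multi-reference loop
    q = rng.normal(size=(4, d))     # new query tokens each step
    k, v = cache.kv("ref0", ref)    # K/V fetched from cache, not recomputed
    out = attention(q, k, v)
print(out.shape)                    # (4, 8)
```

The saving comes from the loop body: the reference projection cost is paid once regardless of how many generation steps reuse it.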
## Training Details
### Training Data
Inherited from FLUX.2 Klein 9B; no additional training was performed.
### Training Procedure
Post-training quantization only:
- MXFP8 (8-bit microscaling floating point)
- NVFP4 (4-bit NVIDIA floating point)
- KV-cache enabled
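The two formats trade precision for size via block-scaled encoding: MXFP8 stores 8-bit elements with one shared scale per 32-value block, NVFP4 stores 4-bit elements with one scale per 16-value block. The sketch below is a deliberately simplified illustration of block scaling using symmetric integer levels — real MXFP8/NVFP4 use specific FP8/FP4 element encodings and scale formats, which this toy version does not reproduce.

```python
import numpy as np

def block_quantize(x, bits, block):
    """Toy block-scaled quantize/dequantize round trip: one shared scale
    per block of values. Integer levels stand in for the real FP8/FP4
    element encodings, just to show the precision/size trade-off."""
    qmax = 2 ** (bits - 1) - 1
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return (q * scale).ravel()

rng = np.random.default_rng(1)
w = rng.normal(size=128).astype(np.float32)
err8 = np.abs(block_quantize(w, bits=8, block=32) - w).mean()  # MXFP8-like
err4 = np.abs(block_quantize(w, bits=4, block=16) - w).mean()  # NVFP4-like
print(err8, err4)  # 4-bit reconstruction error is larger
```

This is the quality/VRAM trade-off in miniature: halving the element width roughly halves weight memory but coarsens the representable levels within each block.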
## Evaluation
### Results
- Faster inference when the KV-cache is exploited
- Reduced VRAM usage
- Small quality trade-offs, larger for NVFP4 than for MXFP8
## Technical Specifications
- ~9B parameters
- Rectified flow transformer architecture
- 1024×1024 resolution
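A rough back-of-envelope for weight memory at each precision, derived from the ~9B parameter count. This is illustrative arithmetic only: actual VRAM use also includes activations, the KV-cache, text encoders, and per-block scale overhead.

```python
params = 9e9  # ~9B parameters

def weight_gb(bits_per_param):
    """Weight storage in GB at a given per-parameter bit width."""
    return params * bits_per_param / 8 / 1e9

print(f"BF16 : {weight_gb(16):.1f} GB")  # ~18 GB
print(f"MXFP8: {weight_gb(8):.1f} GB")   # ~9 GB
print(f"NVFP4: {weight_gb(4):.1f} GB")   # ~4.5 GB
```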
## Model Card Authors
Winnougan