Model Card for Klein-9B KV (MXFP8 / NVFP4)

A quantized, KV-cache-enabled variant of FLUX.2 Klein 9B for faster and more memory-efficient image generation.


Model Details

Model Description

Klein-9B KV (MXFP8 / NVFP4) is a quantized version of FLUX.2 Klein 9B with KV-cache support for improved performance in multi-reference and iterative workflows.

  • Developed by: Black Forest Labs (base), Winnougan (quantization)
  • Model type: Text-to-image / image-to-image generative model
  • License: Apache-2.0 (this repository); the base model's license also applies
  • Quantized from model: FLUX.2-klein-9b-kv-fp8 (post-training quantization, no additional finetuning)

Uses

Direct Use

  • Text-to-image
  • Image-to-image
  • Multi-reference workflows (KV-cache)

Out-of-Scope Use

  • Factual or reliable information generation
  • Safety-critical applications

Bias, Risks, and Limitations

  • May reflect biases from training data
  • Minor quality loss due to quantization
  • KV-cache benefits require a compatible runtime or workflow

Recommendations

Prefer MXFP8 when output quality matters most; choose NVFP4 for the lowest memory use and fastest inference. The KV-cache pays off most in iterative and multi-reference workflows, where the same reference inputs are reused across steps.


How to Get Started with the Model

Load the model in a KV-cache-compatible pipeline (e.g., ComfyUI) and reuse reference images across steps to benefit from the cache.
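To make the benefit concrete, here is a toy sketch (not the model's actual API; all names are illustrative) of why a KV-cache speeds up iterative workflows: key/value projections for unchanged reference inputs are computed once and then reused, instead of being recomputed on every step.

```python
# Toy KV-cache illustration. `project` stands in for an expensive
# key/value projection; the cache ensures each unchanged reference
# input is projected only once across all iterations.
class ToyKVCache:
    def __init__(self):
        self.store = {}
        self.computes = 0  # counts how many projections actually ran

    def project(self, ref: str):
        # Stand-in for an expensive K/V projection of a reference image.
        self.computes += 1
        return (hash(ref), hash(ref) ^ 1)

    def get(self, ref: str):
        # Reuse the cached projection when the reference is unchanged.
        if ref not in self.store:
            self.store[ref] = self.project(ref)
        return self.store[ref]

cache = ToyKVCache()
for step in range(4):                  # four iterative edit steps
    for ref in ["ref_a", "ref_b"]:     # same reference images each step
        cache.get(ref)

# Without caching this would be 4 steps x 2 refs = 8 projections;
# with caching only 2 run (one per unique reference).
```

The same idea is what makes multi-reference workflows cheap: only new or changed references pay the projection cost.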


Training Details

Training Data

Inherited from FLUX.2 Klein 9B. No additional training.


Training Procedure

Post-training quantization:

  • MXFP8 (8-bit microscaling floating point)
  • NVFP4 (NVIDIA 4-bit floating point)
  • KV-cache support enabled
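A back-of-envelope sketch of what the formats mean for weight storage, assuming ~9B parameters (from the spec below) and ignoring activations, the KV-cache itself, and the per-block scale overhead that MXFP8/NVFP4 add on top of the raw element bits:

```python
# Rough weight-only memory estimate for a ~9B-parameter model.
# Figures are approximations: block scales and non-quantized layers
# are not counted.
def weight_bytes(params: float, bits_per_param: float) -> float:
    """Bytes needed to store `params` weights at `bits_per_param`."""
    return params * bits_per_param / 8

PARAMS = 9e9
bf16  = weight_bytes(PARAMS, 16)  # ~18.0 GB unquantized baseline
mxfp8 = weight_bytes(PARAMS, 8)   # ~9.0 GB
nvfp4 = weight_bytes(PARAMS, 4)   # ~4.5 GB
```

This is where the "reduced VRAM usage" result comes from: MXFP8 roughly halves the weight footprint relative to a 16-bit baseline, and NVFP4 halves it again, at the cost of some quality.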

Evaluation

Results

  • Faster inference with KV-cache
  • Reduced VRAM usage
  • Small quality trade-offs depending on format

Technical Specifications

  • ~9B parameters
  • Rectified Flow Transformer
  • 1024×1024 resolution

Model Card Authors

Winnougan
