+---
+language:
+- en
+license: other
+license_name: flux-non-commercial-license
+tags:
+- image-generation
+- image-editing
+- flux
+- diffusion-single-file
+pipeline_tag: image-to-image
+library_name: diffusers
+---
+**Quantized GGUF version of FLUX.2-klein-9b-kv**
+**Original model Link:** [https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv](https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv)
+**Watch us on YouTube:** [@VantageWithAI](https://www.youtube.com/@vantagewithai)
+![Teaser](https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv/resolve/main/realism.jpg)
+![Teaser](https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv/resolve/main/editing.jpg)
+![Teaser](https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv/resolve/main/others.jpg)
+`FLUX.2 [klein] 9B-KV` is an optimized variant of FLUX.2 [klein] 9B with **KV-cache support for accelerated multi-reference editing**. This variant caches key-value pairs from reference images during the first denoising step, eliminating redundant computation in subsequent steps for significantly faster multi-image editing workflows.
+For more information about FLUX.2 [klein], please read our [blog post](https://bfl.ai/blog/flux2-klein-towards-interactive-visual-intelligence).
+# **Key Features**
+1. **KV-Cache Optimization**: Reference image KV pairs are computed once and cached, reducing computation and speeding up inference by up to 2.5 times for multi-reference editing tasks.
+2. All capabilities of FLUX.2 [klein] 9B: sub-second generation, text-to-image, and multi-reference editing in a single unified model.
+3. Ideal for interactive applications and real-time editing pipelines where the same reference images are used across multiple generations.
+4. 9B flow model with 8B Qwen3 text embedder, step-distilled to 4 inference steps.
+5. Available for non-commercial use.
+# **How KV-Caching Works**
+In standard image editing, reference image tokens are processed at every denoising step. With KV-caching:
+- **Step 0**: Full forward pass processes reference tokens and extracts their key-value pairs into a cache.
+- **Steps 1-3**: Cached KV pairs are reused, skipping redundant reference token computation.
+This is particularly beneficial when:
+- Editing with multiple reference images
+- Generating variations with the same references
+- Building interactive editing applications
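The two-phase schedule above can be illustrated with a toy, pure-Python sketch. It is not the FLUX.2 implementation: `project`, `attention`, `denoise`, the scalar weights, and the token values are all hypothetical stand-ins. The toy also assumes the reference keys and values do not depend on the step index, which is why the cached and uncached paths produce identical outputs here. It only shows the bookkeeping: reference K/V are projected once at step 0 and reused afterwards.

```python
import math

# Hypothetical scalar "weights" standing in for learned projection matrices.
W_Q, W_K, W_V = 0.5, 0.8, 1.2


def project(tokens, weight):
    """Toy linear projection: scale every feature of every token."""
    return [[weight * x for x in token] for token in tokens]


def attention(queries, keys, values):
    """Plain softmax attention over lists of feature vectors."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append(
            [sum(w / total * v[i] for w, v in zip(weights, values)) for i in range(dim)]
        )
    return outputs


def denoise(latent_tokens, ref_tokens, num_steps=4, use_cache=True):
    """Run a toy denoising loop; return final latents and the reference-projection count."""
    cache = None
    ref_projections = 0
    x = latent_tokens
    for _step in range(num_steps):
        if use_cache and cache is not None:
            # Steps 1-3: reuse the cached reference keys and values.
            ref_k, ref_v = cache
        else:
            # Step 0 (or every step when caching is off): project the reference tokens.
            ref_k = project(ref_tokens, W_K)
            ref_v = project(ref_tokens, W_V)
            ref_projections += 1
            if use_cache:
                cache = (ref_k, ref_v)
        # Only latent tokens issue queries; they attend to latent + reference K/V.
        q = project(x, W_Q)
        k = project(x, W_K) + ref_k
        v = project(x, W_V) + ref_v
        x = attention(q, k, v)
    return x, ref_projections


latents = [[0.1, -0.2], [0.3, 0.4]]
references = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

cached_out, cached_count = denoise(latents, references, use_cache=True)
plain_out, plain_count = denoise(latents, references, use_cache=False)
print(cached_count, plain_count)  # 1 4
print(cached_out == plain_out)    # True in this toy: the reference K/V never change
```

In the toy, the saving is three skipped reference projections out of four. In a real transformer the analogous saving would apply at every layer, and it grows with the number of reference tokens.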
+# **Usage**
+We provide a reference implementation in our [GitHub repository](https://github.com/black-forest-labs/flux2).
+## **API Endpoints**
+FLUX.2 [klein] 9B-KV is available via the BFL API at [bfl.ai](https://bfl.ai).
+## **Using with Diffusers 🧨**
+To use FLUX.2 [klein] 9B-KV with the 🧨 Diffusers Python library, first install or upgrade Diffusers from source:
+```shell
+pip install git+https://github.com/huggingface/diffusers.git
+```
+Then use `Flux2KleinKVPipeline` to run the model:
+```python
+import torch
+from diffusers import Flux2KleinKVPipeline
+device = "cuda"
+dtype = torch.bfloat16
+model_path = "black-forest-labs/FLUX.2-klein-9b-kv"
+pipe = Flux2KleinKVPipeline.from_pretrained(model_path, torch_dtype=dtype)
+pipe.to(device)
+# Text-to-image (no reference image)
+print("Generating text-to-image...")
+image = pipe(
+    prompt="A cat holding a sign that says hello world",
+    height=1024,
+    width=1024,
+    num_inference_steps=4,
+    generator=torch.Generator(device=device).manual_seed(0),
+).images[0]
+image.save("t2i_output.png")
+print("Saved t2i_output.png")
+# Image-to-image with KV cache (using the generated image as reference)
+print("Generating image-to-image with KV cache...")
+image_kv = pipe(
+    prompt="A cat dressed like a wizard",
+    image=image,
+    height=1024,
+    width=1024,
+    num_inference_steps=4,
+    generator=torch.Generator(device=device).manual_seed(0),
+).images[0]
+image_kv.save("kv_output.png")
+print("Saved kv_output.png")
+```
+---
+# Limitations
+- This model is not intended or able to provide factual information.
+- While the model can output text, text rendered may be inaccurate or subject to distortion.
+- As a statistical model, this checkpoint may represent or amplify biases observed in the training data.
+- The model may fail to generate output that matches the prompts.
+- Prompt following is heavily influenced by the prompting style.
+# Out-of-Scope Use
+This model and its derivatives may not be used outside the scope of the license, including for unlawful, fraudulent, defamatory, abusive, or otherwise violative purposes as further explained in our Usage Policies.
+# Hardware
+The FLUX.2 [klein] 9B-KV model fits in ~29 GB of VRAM and can run on an NVIDIA RTX 5090 or a larger GPU.
+---
+# Responsible AI Development
+Black Forest Labs is committed to responsible model development and deployment. Prior to releasing FLUX.2 [klein] 9B-KV, we evaluated and mitigated a number of risks, including child sexual abuse material (CSAM) and nonconsensual intimate imagery (NCII). For detailed information about our mitigations, evaluation processes, content provenance features, and policies, please see our post: [Capable, Open, and Safe: Combating AI Misuse](https://bfl.ai/blog/capable-open-and-safe-combating-ai-misuse).
+To report safety concerns, contact safety@blackforestlabs.ai.
+---
+# License
+This model falls under the [FLUX Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv-fp8/blob/main/LICENSE).
+# Trademarks & IP
+This project may contain trademarks or logos for projects, products, or services. Use of Black Forest Labs and FLUX trademarks or logos in modified versions of this project must not cause confusion or imply sponsorship or endorsement. Any use of third-party trademarks, intellectual property, or logos is subject to those third parties' policies.