| --- |
| license: other |
| license_name: flux-non-commercial-license |
| license_link: LICENSE.md |
| base_model: |
| - black-forest-labs/FLUX.2-klein-4B |
| base_model_relation: quantized |
| library_name: diffusers |
| tags: |
| - sdnq |
| - flux |
| - 4-bit |
| --- |
Dynamic 4-bit quantization of [black-forest-labs/FLUX.2-klein-4B](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B) using [SDNQ](https://github.com/Disty0/sdnq).
|
|
This model uses per-layer, fine-grained quantization.
The dtype for each layer is selected dynamically by trial and error until the std-normalized MSE loss falls below the selected threshold.
|
|
The minimum allowed dtype is set to uint4 and the std-normalized MSE loss threshold is set to 1e-2.
This produced a mixed-precision model with uint4 and int5 dtypes.
SVD quantization is disabled.
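The selection loop can be sketched roughly as follows. This is a simplified illustration, not SDNQ's actual implementation: the `fake_quantize` round trip, the candidate bit widths, and normalizing the MSE by the variance of the original weights are all assumptions made for the example.

```python
import torch

def std_normalized_mse(original: torch.Tensor, quantized: torch.Tensor) -> float:
    # MSE between original and quantized weights, normalized by the
    # variance (std squared) of the original weights (assumed definition).
    return (torch.mean((original - quantized) ** 2) / torch.std(original) ** 2).item()

def fake_quantize(weight: torch.Tensor, bits: int) -> torch.Tensor:
    # Toy asymmetric uniform quantize-dequantize round trip, for illustration only.
    levels = 2 ** bits - 1
    w_min, w_max = weight.min(), weight.max()
    scale = (w_max - w_min) / levels
    return torch.round((weight - w_min) / scale) * scale + w_min

def select_bits(weight: torch.Tensor, threshold: float = 1e-2, min_bits: int = 4) -> int:
    # Try progressively wider dtypes (e.g. uint4, int5, ...) until the
    # std-normalized MSE drops below the threshold.
    for bits in range(min_bits, 9):
        if std_normalized_mse(weight, fake_quantize(weight, bits)) < threshold:
            return bits
    return 8  # fall back to the widest candidate
```

With a 1e-2 threshold, layers whose weights tolerate 4-bit error stay at uint4, while more sensitive layers get bumped to 5 bits, which matches the uint4/int5 mix described above.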
|
|
| Usage: |
| ``` |
| pip install sdnq |
| ``` |
|
|
| ```py |
| import torch |
| import diffusers |
| from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers |
| from sdnq.common import use_torch_compile as triton_is_available |
| from sdnq.loader import apply_sdnq_options_to_model |
| |
| pipe = diffusers.Flux2KleinPipeline.from_pretrained("Disty0/FLUX.2-klein-4B-SDNQ-4bit-dynamic", torch_dtype=torch.bfloat16) |
| |
| # Enable INT8 MatMul for AMD, Intel ARC and Nvidia GPUs: |
| if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()): |
| pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True) |
| pipe.text_encoder = apply_sdnq_options_to_model(pipe.text_encoder, use_quantized_matmul=True) |
| # pipe.transformer = torch.compile(pipe.transformer) # optional for faster speeds |
| |
| pipe.enable_model_cpu_offload() |
| |
| prompt = "A cat holding a sign that says hello world" |
| image = pipe( |
| prompt=prompt, |
| height=1024, |
| width=1024, |
| guidance_scale=1.0, |
| num_inference_steps=4, |
| generator=torch.manual_seed(0) |
| ).images[0] |
| |
| image.save("flux-klein-sdnq-4bit-dynamic.png") |
| ``` |
|
|
| Original BF16 vs SDNQ quantization comparison: |
|
|
| | Quantization | Model Size | Visualization | |
| | --- | --- | --- | |
| | Original BF16 | 7.8 GB |  | |
| | SDNQ 4 Bit | 2.5 GB |  | |
|
|
|
|