Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated โ€” GGUF

GGUF conversion of huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated for use with llama.cpp.

Credits

Role Model / Author
Base LLM Qwen/Qwen3.5-4B โ€” Alibaba Qwen Team
Abliterated (uncensored) huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated โ€” Huihui AI
GGUF Conversion hotdogs โ€” via llama.cpp

๐Ÿ™ Huge thanks to Qwen Team (Alibaba) for the base model, Huihui AI for the abliteration, and ggerganov for llama.cpp!

Model Details

Spec Value
Parameters ~4B
Architecture Qwen3.5 Multimodal (QWEN35)
hiddensize 2560
Layers 32
Attention Heads 16 (KV: 4)
Context Length 262,144 (256K tokens)
FFN Intermediate 9216
Vision Encoder 24 layers, hiddensize=1024, patchsize=16
Modality image-text-to-text ๐Ÿ–ผ๏ธโžก๏ธ๐Ÿ“
Censorship Abliterated (refusal direction removed)
License Apache 2.0

Available Quantizations

File Size BPW Quality Recommended For
huihui-qwen35-4b-BF16.gguf 7.9 GB 16.00 โญ Full Best quality, 16GB+ VRAM
huihui-qwen35-4b-Q8_0.gguf 4.2 GB ~8.00 โญ Very High Balanced, 8GB+ VRAM
huihui-qwen35-4b-Q4_K_M.gguf 2.6 GB 5.13 โญ Good Low VRAM, 6GB+ VRAM
mmproj-huihui-qwen35-4b-BF16.gguf 645 MB โ€” Vision Multimodal projector (required for images)

Usage

Text-only

./llama-cli -m huihui-qwen35-4b-Q4_K_M.gguf -p "Hello!" -n 256

Multimodal (image + text)

./llama-qwen2vl-cli -m huihui-qwen35-4b-Q4_K_M.gguf --mmproj mmproj-huihui-qwen35-4b-BF16.gguf --image photo.jpg -p "Describe this image"

Server (OpenAI-compatible API)

./llama-server -m huihui-qwen35-4b-Q4_K_M.gguf --mmproj mmproj-huihui-qwen35-4b-BF16.gguf --host 0.0.0.0 --port 8080

Python (llama-cpp-python)

llm = Llama(model_path="huihui-qwen35-4b-Q4_K_M.gguf", n_ctx=32768) output = llm("Hello!", max_tokens=128)

About Abliteration

This model has undergone directional ablation โ€” a technique that removes the "refusal direction" from the model's activation space (Arditi et al. 2024). The model will not refuse questions that base Qwen3.5 would normally decline.

Use responsibly. Ensure your use case complies with applicable laws.

Conversion Notes

  • Converted with llama.cpp convert_hf_to_gguf.py
  • BF16 output type
  • QWEN35 architecture, Qwen3VLVisionModel for mmproj
  • Metadata preserved from source model
Downloads last month
1,255
GGUF
Model size
4B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

4-bit

6-bit

8-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for hotdogs/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated-GGUF

Finetuned
Qwen/Qwen3.5-4B
Quantized
(229)
this model