Quantized version of darkc0de/Hermes-4.3-36B-heretic, a model abliterated with the Heretic method, which doesn't seem to damage the base model as much as previous abliteration methods did.

The repo includes the following quantized file:

For cards with 24GB of VRAM

  • IQ4_XS (with iMatrix): should fit in 24GB of VRAM at a 16K context length.
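As a rough sketch, the quant can be loaded with llama.cpp along these lines. The GGUF file name and the GPU layer count are assumptions here; adjust them to match your download and hardware:

```shell
# Minimal llama.cpp launch sketch (assumed file name; adjust paths to your setup).
# -c 16384 sets the 16K context window mentioned above; -ngl 99 requests full
# GPU offload, which an IQ4_XS quant of a 36B model should allow on a 24GB card.
llama-cli \
  -m Hermes-4.3-36B-heretic-IQ4_XS.gguf \
  -c 16384 \
  -ngl 99 \
  -p "Hello"
```

If you run out of VRAM, lowering `-ngl` to offload fewer layers (at the cost of speed) is the usual first adjustment.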

Settings

Instruction Template: Llama3-Chat Thinker

See the official Hermes readme for complete documentation and references.

Model details

  • Format: GGUF (4-bit)
  • Model size: 36B params
  • Architecture: seed_oss
Model tree

  • SerialKicked/Hermes-4.3-36B-heretic-GGUF-IQ4_XS (this model), quantized from darkc0de/Hermes-4.3-36B-heretic