- IQ4F : IQ4_XS feed-forward (IQ4_NL for ffn_down due to shape constraints)
- Q8A : Q8_0 attention, Q8_0 output, Q8_0 embeds
- Q8SH : Q8_0 shared experts
Gives readable generation speeds on a 24 GiB GPU + 64 GB RAM, even with long context.
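A mix like the one above can be reproduced with llama.cpp's `llama-quantize` by overriding quantization types per tensor. The sketch below is a hypothetical invocation, not the exact command used for this upload: it assumes a recent llama.cpp build with the `--tensor-type PATTERN=TYPE` override, and the file names and MoE tensor-name patterns (`ffn_*_exps`, `ffn_*_shexp`) are placeholders that should be checked against the actual model's tensor names.

```shell
# Hypothetical sketch, assuming a llama.cpp build with --tensor-type overrides.
# Paths and tensor-name regexes are placeholders; verify against the model.
./llama-quantize \
  --tensor-type "ffn_(gate|up)_exps=iq4_xs" \
  --tensor-type "ffn_down_exps=iq4_nl" \
  --tensor-type "ffn_.*_shexp=q8_0" \
  --tensor-type "attn_.*=q8_0" \
  --output-tensor-type q8_0 \
  --token-embedding-type q8_0 \
  model-F16.gguf model-IQ4F.gguf iq4_xs
```

The trailing base type (`iq4_xs`) applies to any tensor no override matches; `ffn_down` gets IQ4_NL because IQ4_XS's block size does not divide some ffn_down shapes.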