This repository contains an NVFP4 (W4A4) quantized version of https://huggingface.co/ArliAI/GLM-4.5-Air-Derestricted.

Prerequisites:

To run this model successfully, you must meet the following software requirements:

vLLM: Version 0.11.1 or higher.

compressed-tensors: Version 0.13.0 or higher. Models quantized with llm-compressor >= v0.9.0 add properties to config.json that are not compatible with compressed-tensors < v0.13.0.
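Because an outdated compressed-tensors build fails only at load time, it can be useful to verify both minimum versions before downloading the weights. The sketch below is illustrative: the helper names `version_at_least` and `check_min_version` are not part of any library, and the comparison is a simplified numeric one that ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version


def version_at_least(installed: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, part by part (simplified)."""
    def parts(v: str) -> list[int]:
        # Keep only purely numeric components, e.g. "0.13.0" -> [0, 13, 0].
        return [int(p) for p in v.split(".") if p.isdigit()]
    return parts(installed) >= parts(minimum)


def check_min_version(package: str, minimum: str) -> bool:
    """Return True if `package` is installed at or above `minimum`."""
    try:
        return version_at_least(version(package), minimum)
    except PackageNotFoundError:
        return False


# Minimum versions stated in this model card.
REQUIREMENTS = {"vllm": "0.11.1", "compressed-tensors": "0.13.0"}

if __name__ == "__main__":
    for pkg, minimum in REQUIREMENTS.items():
        status = "OK" if check_min_version(pkg, minimum) else "missing or too old"
        print(f"{pkg} >= {minimum}: {status}")
```

With both requirements satisfied, the model should then be servable with vLLM's standard entry point, e.g. `vllm serve gesong2077/GLM-4.5-Air-Derestricted-NVFP4`.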

Model size: 63B params (Safetensors)

Tensor types: BF16 · F32 · F8_E4M3 · U8

Model tree for gesong2077/GLM-4.5-Air-Derestricted-NVFP4: this model is one of 23 quantized versions of the base model.