This repository contains an NVFP4 (W4A4) quantized version of https://huggingface.co/ArliAI/GLM-4.5-Air-Derestricted.
Prerequisites:
To run this model successfully, you must meet the following software requirements:
- vLLM: version 0.11.1 or higher.
- compressed-tensors: version 0.13.0 or higher. Models quantized with llm-compressor >= v0.9.0 add extra properties to config.json that are not compatible with compressed-tensors < v0.13.0.
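A minimal launch sketch under the requirements above. The repository id below is a placeholder, not this model's actual Hub path, and a CUDA-capable environment with official vLLM wheels is assumed:

```shell
# Install versions that satisfy the prerequisites
# (assumption: a pip environment with CUDA-enabled vLLM wheels available)
pip install "vllm>=0.11.1" "compressed-tensors>=0.13.0"

# Serve the quantized checkpoint; <this-repo-id> is a placeholder for this
# model's Hugging Face repository id. vLLM picks up the compressed-tensors
# quantization settings from config.json automatically.
vllm serve <this-repo-id>
```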