Qwen3-30B-A3B-w8a8-QuaRot-310

1. Basic Information

| Item | Information |
| --- | --- |
| Original Model Name | Qwen3-30B-A3B |
| Original Model Link | Qwen/Qwen3-30B-A3B |
| msmodelslim commit id | 6a860e4a7b48b4573a8aeeaa12123d2bbc9ec9b8 |
| msmodelslim User Guide | Readme |
| Accuracy Test Hardware | Atlas 300I DUO |
| Accuracy Test Platform | MindIE Docker Image |
| Version | MindIE 2.3.0 |

2. Quantization Command

Model W8A8 Quantization

python3 quant_qwen_moe_w8a8.py --model_path {floating-point weights path} \
--save_path {W8A8 quantized weights path} \
--anti_dataset ../common/qwen3-moe_anti_prompt_50.json \
--calib_dataset ../common/qwen3-moe_calib_prompt_50.json \
--trust_remote_code True \
--rot
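
The `--rot` flag presumably enables the QuaRot-style rotation named in the model title. The following is a minimal pure-Python sketch of the general idea only, not msmodelslim's implementation: an orthonormal Hadamard rotation spreads an outlier channel across all channels, so the symmetric int8 scale (step size) shrinks, and applying the rotation again recovers the original values. All function names and the sample vector are hypothetical.

```python
import math

def hadamard(n):
    """Orthonormal n x n Hadamard matrix (Sylvester construction; n must be a power of two)."""
    h = [[1.0]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]

def matvec(m, x):
    """Multiply matrix m by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def int8_scale(x):
    """Step size of symmetric per-tensor int8 quantization."""
    return max(abs(v) for v in x) / 127.0

# Hypothetical activation vector with a single outlier channel.
x = [0.1, -0.2, 0.15, 50.0, 0.05, -0.1, 0.2, -0.15]

h = hadamard(8)
x_rot = matvec(h, x)        # rotated activations
x_back = matvec(h, x_rot)   # Sylvester Hadamard is symmetric, so H @ H = I

print("scale before rotation:", int8_scale(x))
print("scale after rotation: ", int8_scale(x_rot))
```

In a real pipeline the inverse rotation is folded into the adjacent weights so the layer output is unchanged; this sketch only illustrates why the rotated tensor is easier to quantize.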

3. Accuracy Test Results

| Model Name | Quantization Format | Dataset | Test Accuracy (%) | Floating-Point Accuracy (%) |
| --- | --- | --- | --- | --- |
| Qwen3-30B-A3B-w8a8-QuaRot-310 | w8a8 | BoolQ | 88.01 | 88.01 |
| Qwen3-30B-A3B-w8a8-QuaRot-310 | w8a8 | CEval | 84.32 | 83.88 |
| Qwen3-30B-A3B-w8a8-QuaRot-310 | w8a8 | GSM8K | 94.54 | 94.31 |
  • Accuracy data was obtained from inference results in non-thinking mode.
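
To read the table at a glance, the gap between quantized and floating-point accuracy can be computed per dataset. This is a small illustrative sketch that uses only the numbers reported above; the variable names are hypothetical.

```python
# (quantized accuracy %, floating-point accuracy %) as reported in the table above
results = {
    "BoolQ": (88.01, 88.01),
    "CEval": (84.32, 83.88),
    "GSM8K": (94.54, 94.31),
}

# Positive delta means the quantized model scored higher than the float baseline.
deltas = {name: round(quant - fp, 2) for name, (quant, fp) in results.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f} percentage points")
```

On these three benchmarks the quantized model matches or slightly exceeds the floating-point baseline.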