Qwen3-30B-A3B-w8a8-QuaRot-310
1. Basic Information
| Item | Information |
|---|---|
| Original Model Name | Qwen3-30B-A3B |
| Original Model Link | Qwen/Qwen3-30B-A3B |
| msmodelslim commit id | 6a860e4a7b48b4573a8aeeaa12123d2bbc9ec9b8 |
| msmodelslim User Guide | Readme |
| Accuracy Test Hardware | Atlas 300I DUO |
| Accuracy Test Platform | MindIE Docker Image |
| Version | MindIE 2.3.0 |
2. Quantization Command
Model Sparse Quantization
python3 quant_qwen_moe_w8a8.py --model_path {floating-point weights path} \
--save_path {W8A8 quantized weights path} \
--anti_dataset ../common/qwen3-moe_anti_prompt_50.json \
--calib_dataset ../common/qwen3-moe_calib_prompt_50.json \
--trust_remote_code True \
--rot
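
When the quantization step is driven from a script, the same invocation can be assembled as an argument list. The sketch below is illustrative only: the helper name `build_quant_command`, the `common_dir` parameter, and the two example paths are hypothetical, while the script name and flags are copied from the command above. It builds and prints the command without running it.

```python
import shlex


def build_quant_command(model_path, save_path, common_dir="../common"):
    """Return the argv list for the quantization command shown above.

    model_path and save_path stand in for the {floating-point weights path}
    and {W8A8 quantized weights path} placeholders.
    """
    return [
        "python3", "quant_qwen_moe_w8a8.py",
        "--model_path", model_path,
        "--save_path", save_path,
        "--anti_dataset", f"{common_dir}/qwen3-moe_anti_prompt_50.json",
        "--calib_dataset", f"{common_dir}/qwen3-moe_calib_prompt_50.json",
        "--trust_remote_code", "True",
        "--rot",
    ]


# Hypothetical paths, for illustration only.
example_cmd = build_quant_command("/path/to/Qwen3-30B-A3B", "/path/to/Qwen3-30B-A3B-w8a8")

# Print a shell-safe rendering of the command for inspection.
print(shlex.join(example_cmd))
```

Passing `example_cmd` to `subprocess.run(example_cmd, check=True)` from the directory that contains `quant_qwen_moe_w8a8.py` would launch the run; that step is left out here so the sketch has no side effects.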
3. Accuracy Test Results
| Model Name | Quantization Format | Dataset | Quantized Accuracy (%) | Floating-Point Accuracy (%) |
|---|---|---|---|---|
| Qwen3-30B-A3B-w8a8-QuaRot-310 | w8a8 | BoolQ | 88.01 | 88.01 |
| Qwen3-30B-A3B-w8a8-QuaRot-310 | w8a8 | CEval | 84.32 | 83.88 |
| Qwen3-30B-A3B-w8a8-QuaRot-310 | w8a8 | GSM8K | 94.54 | 94.31 |
- All accuracy figures were obtained from inference results with the model running in non-thinking mode.
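
To make the comparison in the table easier to read, the gap between quantized and floating-point accuracy can be computed per dataset. The following is a minimal sketch: the numbers are copied from the table above, and the names `RESULTS` and `accuracy_deltas` are illustrative, not part of any tool.

```python
# (quantized accuracy %, floating-point accuracy %) per dataset, from the table above.
RESULTS = {
    "BoolQ": (88.01, 88.01),
    "CEval": (84.32, 83.88),
    "GSM8K": (94.54, 94.31),
}


def accuracy_deltas(results):
    """Return quantized minus floating-point accuracy, in percentage points."""
    return {
        dataset: round(quantized - floating_point, 2)
        for dataset, (quantized, floating_point) in results.items()
    }


deltas = accuracy_deltas(RESULTS)
for dataset, delta in deltas.items():
    # A non-negative delta means the quantized model scored at least as well.
    print(f"{dataset}: {delta:+.2f} percentage points")
```

On these three datasets the quantized model matches or slightly exceeds the floating-point scores.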