Qwen3-30B-A3B-w8a8-QuaRot-310

1. Basic Information

Item	Information
Original Model Name	Qwen3-30B-A3B
Original Model Link	Qwen/Qwen3-30B-A3B
msmodelslim commit id	6a860e4a7b48b4573a8aeeaa12123d2bbc9ec9b8
msmodelslim User Guide	Readme
Accuracy Test Hardware	Atlas 300I DUO
Accuracy Test Platform	MindIE Docker Image
Version	MindIE 2.3.0

2. Quantization Command

Model W8A8 Quantization (QuaRot)

python3 quant_qwen_moe_w8a8.py --model_path {floating-point weights path} \
--save_path {W8A8 quantized weights path} \
--anti_dataset ../common/qwen3-moe_anti_prompt_50.json \
--calib_dataset ../common/qwen3-moe_calib_prompt_50.json \
--trust_remote_code True \
--rot
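
For scripted runs, the same invocation can be assembled as an argument list. This is a minimal sketch, not part of msmodelslim: the helper name `build_quant_command` and both example paths are hypothetical, and only the flags shown in the command above are used.

```python
import shlex


def build_quant_command(model_path, save_path):
    """Assemble the argument list for the W8A8 quantization command above."""
    return [
        "python3", "quant_qwen_moe_w8a8.py",
        "--model_path", model_path,    # floating-point weights path
        "--save_path", save_path,      # W8A8 quantized weights path
        "--anti_dataset", "../common/qwen3-moe_anti_prompt_50.json",
        "--calib_dataset", "../common/qwen3-moe_calib_prompt_50.json",
        "--trust_remote_code", "True",
        "--rot",
    ]


# Hypothetical paths, for illustration only.
cmd = build_quant_command("/data/Qwen3-30B-A3B", "/data/Qwen3-30B-A3B-w8a8")

# Print the shell-quoted command for review before running it.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would launch the job, assuming the working directory is the one that contains `quant_qwen_moe_w8a8.py` so that the relative `../common/` dataset paths resolve.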

3. Accuracy Test Results

Model Name	Quantization Format	Dataset	Quantized Accuracy (%)	Floating-Point Accuracy (%)
Qwen3-30B-A3B-w8a8-QuaRot-310	w8a8	BoolQ	88.01	88.01
Qwen3-30B-A3B-w8a8-QuaRot-310	w8a8	CEval	84.32	83.88
Qwen3-30B-A3B-w8a8-QuaRot-310	w8a8	GSM8K	94.54	94.31

All accuracy figures above were measured with the model running in non-thinking mode.
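
To make the comparison explicit, the gap between the quantized and floating-point scores can be computed directly from the table. The sketch below only restates the reported numbers; the variable names are illustrative.

```python
# (dataset, quantized accuracy %, floating-point accuracy %) from the table above
results = [
    ("BoolQ", 88.01, 88.01),
    ("CEval", 84.32, 83.88),
    ("GSM8K", 94.54, 94.31),
]

# Difference in percentage points: quantized minus floating-point
deltas = {name: round(quant - fp, 2) for name, quant, fp in results}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

On these three datasets the quantized model matches or slightly exceeds the floating-point baseline, by at most 0.44 percentage points. Differences this small may well fall within ordinary evaluation noise, so they are best read as "no measurable loss" rather than as an improvement.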
