Qwopus3.6-27B-v2-INT4-W4A16-Autoround
Quantized version of Jackrong/Qwopus3.6-27B-v2, produced with the AutoRound algorithm and no calibration dataset.
Quantization command:
auto-round --model Jackrong/Qwopus3.6-27B-v2 \
--scheme "W4A16" \
--format "auto_round" \
--output_dir ./Jackrong_Qwopus3.6-27B-v2-INT4-W4A16-Autoround \
--iters 1000 \
--enable_torch_compile \
--ignore_layers "model.language_model.embed_tokens,model.visual.*,mtp.*,input_layernorm,post_attention_layernorm,q_norm,k_norm,conv1d,linear_attn.norm"
Layers kept at FP16:
- mtp.*
- model.language_model.embed_tokens
- model.visual.*
- input_layernorm
- post_attention_layernorm
- q_norm
- k_norm
- conv1d
- linear_attn.norm
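As a rough illustration of how the patterns above select layers, the sketch below checks a few module names against the list. It assumes each pattern is applied as a regular-expression search anywhere in the module name; AutoRound's actual matching rules may differ. The module names in `examples` are hypothetical and were not read from the checkpoint.

```python
import re

# Patterns copied from the --ignore_layers flag above.
IGNORE_PATTERNS = [
    "model.language_model.embed_tokens",
    "model.visual.*",
    "mtp.*",
    "input_layernorm",
    "post_attention_layernorm",
    "q_norm",
    "k_norm",
    "conv1d",
    "linear_attn.norm",
]


def is_kept_fp16(layer_name: str) -> bool:
    """Return True if any ignore pattern matches somewhere in the name.

    Assumption: regex search semantics, not necessarily what AutoRound does.
    """
    return any(re.search(pattern, layer_name) for pattern in IGNORE_PATTERNS)


# Hypothetical module names, for illustration only.
examples = [
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.self_attn.q_norm",
    "model.language_model.layers.0.linear_attn.conv1d",
    "model.visual.blocks.0.attn.qkv",
    "mtp.layers.0.mlp.down_proj",
]

for name in examples:
    precision = "FP16" if is_kept_fp16(name) else "INT4"
    print(f"{name}: {precision}")
```

To see what was really skipped, compare against the module names stored in the quantized checkpoint rather than relying on this sketch.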
Evaluation (WIP)
GSM8K (full test set, 1319 samples)
lm_eval --model vllm \
--model_args pretrained=<MODEL_PATH>,tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.95 \
--tasks gsm8k \
--gen_kwargs temperature=1.0 top_p=0.95 max_completion_tokens=1024 top_k=20 min_p=0.0 presence_penalty=0.0 repetition_penalty=1.0
| Model | Size (GB) | Accuracy (flexible-extract / strict-match) |
|---|---|---|
| Qwen/Qwen3.6-27B | - | 0.6649 / 0.6732 |
| Jackrong/Qwopus3.6-27B-v2 | 52 | 0.8256 / 0.8370 |
| XReyRobert/Qwopus3.6-27B-v2-GPTQ-Pro-v1 | 18 | 0.7491 / 0.8127 |
| mconcat/Qwopus3.6-27B-v2-AWQ-4bit | 26 | 0.7195 / 0.7331 |
| JC1DA/Qwopus3.6-27B-v2-INT4-W4A16-Autoround | 19 | 0.7832 / 0.8036 |
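One way to read the table is as accuracy retained relative to the unquantized Jackrong/Qwopus3.6-27B-v2 baseline. The sketch below does that arithmetic using only the numbers in the table; the names `RESULTS`, `BASELINE`, and `retention` are introduced here for illustration. Because the run samples at temperature=1.0, small differences between rows may be noise rather than a real gap.

```python
# Values copied from the table above:
# model -> (size in GB, flexible-extract accuracy, strict-match accuracy)
RESULTS = {
    "Jackrong/Qwopus3.6-27B-v2": (52, 0.8256, 0.8370),
    "XReyRobert/Qwopus3.6-27B-v2-GPTQ-Pro-v1": (18, 0.7491, 0.8127),
    "mconcat/Qwopus3.6-27B-v2-AWQ-4bit": (26, 0.7195, 0.7331),
    "JC1DA/Qwopus3.6-27B-v2-INT4-W4A16-Autoround": (19, 0.7832, 0.8036),
}
BASELINE = "Jackrong/Qwopus3.6-27B-v2"


def retention(model: str, metric: int) -> float:
    """Fraction of the baseline score kept by `model`.

    `metric` is 1 for flexible-extract and 2 for strict-match.
    """
    return RESULTS[model][metric] / RESULTS[BASELINE][metric]


for model, (size_gb, _, _) in RESULTS.items():
    if model == BASELINE:
        continue
    size_ratio = size_gb / RESULTS[BASELINE][0]
    print(
        f"{model}: size {size_ratio:.0%} of baseline, "
        f"flexible-extract {retention(model, 1):.1%}, "
        f"strict-match {retention(model, 2):.1%}"
    )
```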
MMLU-PRO+ (coming soon)