Qwopus3.6-27B-v2-INT4-W4A16-Autoround

Quantized version of Jackrong/Qwopus3.6-27B-v2, produced with the AutoRound algorithm and no calibration dataset.

Quantization command:

auto-round --model Jackrong/Qwopus3.6-27B-v2 \
  --scheme "W4A16" \
  --format "auto_round" \
  --output_dir ./Jackrong_Qwopus3.6-27B-v2-INT4-W4A16-Autoround \
  --iters 1000 \
  --enable_torch_compile \
  --ignore_layers "model.language_model.embed_tokens,model.visual.*,mtp.*,input_layernorm,post_attention_layernorm,q_norm,k_norm,conv1d,linear_attn.norm"

Layers kept at FP16:

  • mtp.*
  • model.language_model.embed_tokens
  • model.visual.*
  • input_layernorm
  • post_attention_layernorm
  • q_norm
  • k_norm
  • conv1d
  • linear_attn.norm
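
To make the effect of the ignore list concrete, here is a minimal Python sketch that checks which module names would stay at FP16. It assumes each pattern is treated as a regular expression searched anywhere in the module name; that matching rule is an assumption for illustration and may differ from what auto-round does internally. The example layer names are hypothetical, not read from the checkpoint.

```python
import re

# Patterns copied from the --ignore_layers flag in the quantization command.
IGNORE_PATTERNS = [
    "model.language_model.embed_tokens",
    "model.visual.*",
    "mtp.*",
    "input_layernorm",
    "post_attention_layernorm",
    "q_norm",
    "k_norm",
    "conv1d",
    "linear_attn.norm",
]


def is_kept_fp16(layer_name, patterns=IGNORE_PATTERNS):
    """Return True if any ignore pattern is found in the layer name.

    Assumption: patterns behave like regexes searched anywhere in the name.
    """
    return any(re.search(pattern, layer_name) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical module names, used only to illustrate the matching.
    examples = [
        "model.language_model.layers.0.self_attn.q_proj",
        "model.language_model.layers.0.self_attn.q_norm",
        "model.language_model.layers.3.linear_attn.conv1d",
        "model.visual.blocks.0.attn.qkv",
        "mtp.fc",
    ]
    for name in examples:
        status = "FP16" if is_kept_fp16(name) else "INT4"
        print(f"{status:5s} {name}")
```

Under this assumed rule, projection weights such as `q_proj` are quantized, while norms, the vision tower, and the MTP head are left untouched.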

Evaluation (WIP)

GSM-8K (Full dataset - 1319 samples)

lm_eval --model vllm \
  --model_args pretrained=<MODEL_PATH>,tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.95 \
  --tasks gsm8k \
  --gen_kwargs temperature=1.0 top_p=0.95 max_completion_tokens=1024 top_k=20 min_p=0.0 presence_penalty=0.0 repetition_penalty=1.0

| Model | Size (GB) | Accuracy (flexible-extract / strict-match) |
|---|---|---|
| Qwen/Qwen3.6-27B | - | 0.6649 / 0.6732 |
| Jackrong/Qwopus3.6-27B-v2 | 52 | 0.8256 / 0.8370 |
| XReyRobert/Qwopus3.6-27B-v2-GPTQ-Pro-v1 | 18 | 0.7491 / 0.8127 |
| mconcat/Qwopus3.6-27B-v2-AWQ-4bit | 26 | 0.7195 / 0.7331 |
| JC1DA/Qwopus3.6-27B-v2-INT4-W4A16-Autoround | 19 | 0.7832 / 0.8036 |
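
One way to read the results above is as accuracy retained relative to the unquantized Jackrong/Qwopus3.6-27B-v2 baseline. The short Python sketch below does that arithmetic with the scores from this section; the `retention` helper is a hypothetical name introduced here, and the scores come from a single sampled run (temperature=1.0), so small differences should be read with caution.

```python
# Scores copied from the results above: (flexible-extract, strict-match).
RESULTS = {
    "Jackrong/Qwopus3.6-27B-v2": (0.8256, 0.8370),
    "XReyRobert/Qwopus3.6-27B-v2-GPTQ-Pro-v1": (0.7491, 0.8127),
    "mconcat/Qwopus3.6-27B-v2-AWQ-4bit": (0.7195, 0.7331),
    "JC1DA/Qwopus3.6-27B-v2-INT4-W4A16-Autoround": (0.7832, 0.8036),
}
BASELINE = "Jackrong/Qwopus3.6-27B-v2"


def retention(results, baseline):
    """Return each quantized model's score divided by the baseline score."""
    base_flex, base_strict = results[baseline]
    return {
        name: (flex / base_flex, strict / base_strict)
        for name, (flex, strict) in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (flex, strict) in retention(RESULTS, BASELINE).items():
        print(f"{name}: flexible {flex:.1%}, strict {strict:.1%}")
```

With these numbers, the AutoRound checkpoint keeps roughly 95% of the baseline flexible-extract score and roughly 96% of the strict-match score.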

MMLU-PRO+ (coming soon)
