AMD Ryzen AI Max+ 395 Strix Halo
Quantized models benchmarked with Windows ROCm llama.cpp builds from Lemonade, using Unsloth's recommended settings.
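Most entries below run at Q8_0, llama.cpp's simplest quantization: weights are stored in 32-element blocks, each holding one shared fp16 scale plus 32 signed 8-bit values. A toy round-trip sketch of the idea (pure Python, illustrative only; not llama.cpp's actual kernel code):

```python
# Toy sketch of llama.cpp's Q8_0 block quantization: 32 weights share
# one scale; each weight is stored as a signed 8-bit integer.
BLOCK = 32

def q8_0_quantize(weights):
    """Quantize one 32-float block to (scale, list of int8 values)."""
    amax = max(abs(w) for w in weights)
    scale = amax / 127 if amax else 0.0
    quants = [round(w / scale) if scale else 0 for w in weights]
    return scale, quants

def q8_0_dequantize(scale, quants):
    """Reconstruct approximate floats from one quantized block."""
    return [q * scale for q in quants]

block = [(-1) ** i * i / 10 for i in range(BLOCK)]  # toy weights
scale, quants = q8_0_quantize(block)
restored = q8_0_dequantize(scale, quants)
# Worst-case per-weight error is half the scale step.
err = max(abs(a - b) for a, b in zip(block, restored))
```

Q8_0 costs 8 bits per weight plus the per-block scale; the MXFP4 and Q2_K_S entries below trade more reconstruction error for smaller blocks.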
Text Generation • 1B • Note: 145 t/s @ Q8_0. Surprisingly capable in chat. Not usable in OpenCode.
ggml-org/gpt-oss-20b-GGUF
21B • Note: 60 t/s @ MXFP4. OpenCode tools work. Prefer 120B.
mradermacher/Nanbeige4.1-3B-GGUF
4B • Note: 51 t/s @ Q8_0. Thinks for minutes. Not usable in OpenCode.
unsloth/GLM-4.7-Flash-GGUF
Text Generation • 30B • Note: 45 t/s @ Q8_0. OpenCode tool calling works great. Made a nice-looking 400-line OpenMeteo weather app with typeahead search. Required manual TypeScript error fixes to run. Note that the smaller REAP model wasn't faster.
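For context on the recurring test task: an OpenMeteo weather app needs two of Open-Meteo's free, keyless endpoints, the geocoding search (which powers the typeahead box) and the forecast itself. A minimal sketch of the two request URLs (the endpoint paths are Open-Meteo's documented ones; the specific parameter choices are illustrative):

```python
from urllib.parse import urlencode

# Open-Meteo needs two calls: geocoding for the typeahead city search,
# then forecast for the chosen coordinates. No API key required.
def geocoding_url(query, count=5):
    """City-name search; returns candidate places with lat/lon."""
    return ("https://geocoding-api.open-meteo.com/v1/search?"
            + urlencode({"name": query, "count": count}))

def forecast_url(lat, lon):
    """Current weather plus an hourly temperature series."""
    return ("https://api.open-meteo.com/v1/forecast?"
            + urlencode({"latitude": lat, "longitude": lon,
                         "current_weather": "true",
                         "hourly": "temperature_2m"}))

print(geocoding_url("Berlin"))
print(forecast_url(52.52, 13.41))
```

The apps the models produced do the same thing in TypeScript via `fetch`; the URL shape is what matters here.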
bartowski/moonshotai_Kimi-Linear-48B-A3B-Instruct-GGUF
Text Generation • 49B • Note: 45 t/s @ Q8_0. Excellent OpenCode tool calling, including the todo-list and ask-question tools. Made a 600-line OpenMeteo weather app with no errors. Note that it did everything the frontend-design skill said NOT to do, resulting in a comically bad-looking app. Still, the most usable local model on this list.
unsloth/Qwen3.5-35B-A3B-GGUF
Image-Text-to-Text • 35B • Note: TBD
Intel/Qwen3.5-122B-A10B-gguf-q2ks-mixed-AutoRound
122B • Note: TBD
ggml-org/gpt-oss-120b-GGUF
117B • Note: 42 t/s @ MXFP4. Good OpenCode tool calling, writes working TypeScript, but even the frontend-design skill can't get it to make attractive websites. Feels like GPT-4o, which is nice for nostalgia.
unsloth/Qwen3-Coder-Next-GGUF
Text Generation • 80B • Note: 32 t/s @ Q8_0. Could not build a working OpenMeteo weather app. Struggled with the edit tool while attempting to fix errors, and could not properly trace errors in the code.
Intel/MiniMax-M2-REAP-172B-A10B-gguf-q2ks-mixed-AutoRound
173B • Note: 26 t/s @ Q2_K_S. Good chat performance. Didn't try it in OpenCode. Interesting quantization.
Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF
Image-Text-to-Text • 9B • Note: 22 t/s @ Q8_0. Perfect tool calling in OpenCode. Fetched the OpenMeteo API schema from GitHub, initialized a Vite project, and made a multi-file React SPA with no errors. Outputs a stray closing </think> tag after every response.
unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF
24B • Note: 9 t/s @ Q8_0. All dense models are slow on Strix Halo. Speculative decoding (ngram-mod) works very well when it kicks in.
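N-gram speculative decoding helps here because it needs no draft model: it looks up the last few generated tokens earlier in the context and proposes the tokens that followed that earlier occurrence, which the main model then verifies in a single batch. That is why it "kicks in" mostly on repetitive text such as code. A toy sketch of the proposal step (simplified; not llama.cpp's actual ngram-mod implementation, and the token IDs are made up):

```python
# Toy n-gram draft proposal: if the last n tokens appeared earlier in
# the context, speculate that the same continuation follows again.
def ngram_draft(context, n=2, max_draft=4):
    """Return up to max_draft speculative tokens, or [] if no match."""
    if len(context) <= n:
        return []
    tail = context[-n:]
    # Scan backwards for the most recent *earlier* occurrence of the
    # tail (the range excludes the tail's own position).
    for start in range(len(context) - n - 1, -1, -1):
        if context[start:start + n] == tail:
            return context[start + n:start + n + max_draft]
    return []

# "1 2 3 4 1 2": the earlier "1 2" was followed by "3 4 1 2",
# so those four tokens are drafted for cheap batch verification.
ctx = [1, 2, 3, 4, 1, 2]
print(ngram_draft(ctx))  # -> [3, 4, 1, 2]
```

When the draft is accepted, the model scores several tokens in one forward pass instead of one pass per token, which is the win for slow dense models; a rejected draft costs only the wasted verification pass.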
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF
Image-Text-to-Text • 27B • Note: 8 t/s @ Q8_0. Too slow to try in OpenCode. Notably, didn't overthink on anything. Otherwise, same comments as Devstral 24B.