Noteworthy Models 2026 (22B-36B)
Interesting models in 2026 that can run decently with 24GB of VRAM (or a lot of patience)
Image-Text-to-Text • 33B • Updated • 4.47M • 2.25k
Note: Excellent official model. After a very boring 2025, 2026 is shaping up to be awesome. It's surprisingly compliant for a corporate model. It has some structural repetition issues, and it loves its em-dashes and slop sentences way too much, but this is definitely going to be a very strong base for fine-tuners. Also, its SWA KV cache means you can fit a lot more context in much less space than you'd normally need.
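A quick back-of-the-envelope on why the SWA KV cache saves so much space. All the config numbers here (layer count, GQA heads, window size, sliding-to-global layer ratio) are made-up assumptions for illustration, not this model's actual specs:

```python
# Rough KV cache sizing: full attention vs sliding-window attention (SWA).
# Every config number below is an illustrative assumption, not a real spec.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, tokens, bytes_per_elem=2):
    """K and V caches for `tokens` positions across `n_layers`, fp16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * tokens * bytes_per_elem

CTX, WINDOW = 32_768, 4_096       # assumed context length and sliding window
N_KV_HEADS, HEAD_DIM = 8, 128     # assumed GQA config

# Hypothetical all-global baseline: 48 layers each cache the full context.
full = kv_cache_bytes(48, N_KV_HEADS, HEAD_DIM, CTX)

# SWA variant: 40 sliding layers only cache the window, 8 global layers
# still cache everything.
swa = (kv_cache_bytes(40, N_KV_HEADS, HEAD_DIM, min(CTX, WINDOW))
       + kv_cache_bytes(8, N_KV_HEADS, HEAD_DIM, CTX))

print(f"full: {full / 2**30:.2f} GiB, SWA: {swa / 2**30:.2f} GiB")
# Roughly 6.0 GiB vs 1.6 GiB at 32k context with these assumed numbers.
```

At long contexts the sliding layers' cost stays flat, which is where the "a lot more context in much less space" comes from.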
Qwen/Qwen3.5-27B
Image-Text-to-Text • 28B • Updated • 3.23M • 952
Note: 2026's flavor of Qwen. Excellent for serious stuff, as usual. Plus, at 28B you can fit the vision layers without having to kill too much context or quantize too hard. The thinking block tends to be massive, though.
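For a sense of the quantization headroom on a 24 GB card, here's a rough weight-size estimate. The bits-per-weight figures are ballpark assumptions for typical quant levels, not exact numbers for any specific format:

```python
# Rough VRAM math for a 28B model on a 24 GB card.
# Bits-per-weight values are ballpark assumptions for common quant levels.

def weight_gb(params_billion, bits_per_weight):
    """Approximate weight footprint in GB (decimal)."""
    return params_billion * bits_per_weight / 8

q4 = weight_gb(28, 4.5)   # roughly mid-size 4-bit quant
q6 = weight_gb(28, 6.5)   # roughly 6-bit quant

print(f"~4.5 bpw: {q4:.2f} GB, ~6.5 bpw: {q6:.2f} GB")
# At ~4.5 bpw the weights take ~15.8 GB, leaving headroom for the vision
# tower plus KV cache; at ~6.5 bpw (~22.8 GB) there's essentially none.
```

That gap is why a 28B fits the vision layers comfortably at moderate quants where a 33B starts forcing hard trade-offs.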
zai-org/GLM-4.7-Flash
Text Generation • 31B • Updated • 683k • 1.71k
Note: Damn, it's fast and clever. I'm generally not a fan of MoE models, but this one is really, really good. It will, however, eat a bazillion tokens in its thinking block. It's also very bad at writing, and quite censored.
zerofata/Q3.5-BlueStar-v2-27B
27B • Updated • 83 • 36
Note: Creative model based on Qwen 3.5. Thinking and tool calls are preserved (but the model needs to be given explicit instructions). Strong contender for a mixed use case. Decent job, but still early; v3 is shaping up to be a lot better.
zerofata/MS3.2-PaintedFantasy-v4.1-24B
24B • Updated • 37 • 11
Note: First RP model of the year in this list. Similar to previous iterations, but grounded by normal assistant prompts. It's a really solid model, fun to use, and adaptable beyond RP too. It uses L7-Tekken as its instruct format. Optionally supports thinking mode.