Agent Nemo: Qwen2-VL-2B-Instruct (Fine-tuned GGUF)

Fine-tuned Qwen2-VL-2B-Instruct for autonomous web navigation, quantized to GGUF Q4_K_M for edge deployment.

Training

  • Base model: Qwen2-VL-2B-Instruct
  • Method: QLoRA (r=16, alpha=16)
  • Dataset: Mind2Web (~7,775 conversations)
  • Task: Web navigation (screenshot + AXTree → action JSON)
  • LoRA adapters: yashsikdar/agent-nemo-qwen2vl-lora
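
The "screenshot + AXTree → action JSON" task shape can be sketched as a training conversation: the user turn carries the screenshot and a pruned accessibility tree, and the assistant turn is the action serialized as JSON. This is an illustrative sketch only; the exact prompt template and action schema are internal to the fine-tune, and the field names used here (`action`, `target`) are hypothetical.

```python
import json

def build_conversation(axtree: str, instruction: str, action: dict) -> list:
    """Assemble one hypothetical training conversation for the task.

    The real template used during fine-tuning may differ; this only
    illustrates the input/output structure described above.
    """
    user_text = f"Task: {instruction}\nAccessibility tree:\n{axtree}"
    return [
        {"role": "user", "content": [
            {"type": "image"},  # placeholder for the page screenshot
            {"type": "text", "text": user_text},
        ]},
        # Target output: the action encoded as a JSON string
        {"role": "assistant", "content": json.dumps(action)},
    ]

conv = build_conversation(
    axtree="[button] Submit order",
    instruction="Place the order",
    action={"action": "click", "target": "[button] Submit order"},
)
print(conv[1]["content"])
```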

Files

File                                 Size     Description
agent-nemo-qwen2vl-q4_k_m.gguf       ~1.4 GB  Language model (Q4_K_M quantization)
mmproj-agent-nemo-qwen2vl-f16.gguf   ~600 MB  Vision encoder (F16)

Usage with llama-cpp-python

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen25VLChatHandler

# Load the vision encoder (mmproj) so the model can accept screenshots
handler = Qwen25VLChatHandler(clip_model_path="mmproj-agent-nemo-qwen2vl-f16.gguf")

# Load the quantized language model with the multimodal chat handler attached
llm = Llama(model_path="agent-nemo-qwen2vl-q4_k_m.gguf", chat_handler=handler, n_ctx=4096)
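
Once loaded, the model's chat responses contain the chosen action as JSON, often wrapped in code fences or surrounding prose. A small helper like the following can extract and parse that JSON from raw model text. This is a minimal sketch; the action fields shown (`action`, `selector`) are hypothetical examples, not the exact schema produced by the fine-tune.

```python
import json
import re

def parse_action(raw: str) -> dict:
    """Extract the first JSON object from raw model output.

    Searches for the outermost {...} span so fenced or chatty
    responses still parse cleanly.
    """
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# Example: a fenced response using a hypothetical action schema
raw = 'Sure! ```json\n{"action": "click", "selector": "#submit"}\n``` Done.'
action = parse_action(raw)
print(action["action"])  # click
```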

Download

# Using the Agent Nemo download script:
./scripts/download_model.sh

# Or manually:
huggingface-cli download yashsikdar/agent-nemo-qwen2vl-gguf --local-dir ~/.agent-nemo/models/