Agent Nemo: Qwen2-VL-2B-Instruct (Fine-tuned GGUF)

Fine-tuned Qwen2-VL-2B-Instruct for autonomous web navigation, quantized to GGUF Q4_K_M for edge deployment.

Training

  • Base model: Qwen2-VL-2B-Instruct
  • Method: QLoRA (r=16, alpha=16)
  • Dataset: Mind2Web (~7,775 conversations)
  • Task: Web navigation (screenshot + AXTree → action JSON)
  • LoRA adapters: yashsikdar/agent-nemo-qwen2vl-lora
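
The "screenshot + AXTree → action JSON" task shape can be sketched as a training conversation: the user turn carries the screenshot and a pruned accessibility tree, and the assistant turn is the action serialized as JSON. This is an illustrative sketch only; the exact prompt template and action schema are internal to the fine-tune, and the field names used here (`action`, `target`) are hypothetical.

```python
import json

def build_conversation(axtree: str, instruction: str, action: dict) -> list:
    """Assemble one hypothetical training conversation for the task.

    The real template used during fine-tuning may differ; this only
    illustrates the input/output structure described above.
    """
    user_text = f"Task: {instruction}\nAccessibility tree:\n{axtree}"
    return [
        {"role": "user", "content": [
            {"type": "image"},  # placeholder for the page screenshot
            {"type": "text", "text": user_text},
        ]},
        # Target output: the action encoded as a JSON string
        {"role": "assistant", "content": json.dumps(action)},
    ]

conv = build_conversation(
    axtree="[button] Submit order",
    instruction="Place the order",
    action={"action": "click", "target": "[button] Submit order"},
)
print(conv[1]["content"])
```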

Files

File                                 Size     Description
agent-nemo-qwen2vl-q4_k_m.gguf       ~1.4 GB  Language model (Q4_K_M quantization)
mmproj-agent-nemo-qwen2vl-f16.gguf   ~600 MB  Vision encoder (F16)

Usage with llama-cpp-python

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen25VLChatHandler

# Load the vision encoder (mmproj) so the model can accept screenshots
handler = Qwen25VLChatHandler(clip_model_path="mmproj-agent-nemo-qwen2vl-f16.gguf")

# Load the quantized language model with the multimodal chat handler attached
llm = Llama(model_path="agent-nemo-qwen2vl-q4_k_m.gguf", chat_handler=handler, n_ctx=4096)
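
Once loaded, the model's chat responses contain the chosen action as JSON, often wrapped in code fences or surrounding prose. A small helper like the following can extract and parse that JSON from raw model text. This is a minimal sketch; the action fields shown (`action`, `selector`) are hypothetical examples, not the exact schema produced by the fine-tune.

```python
import json
import re

def parse_action(raw: str) -> dict:
    """Extract the first JSON object from raw model output.

    Searches for the outermost {...} span so fenced or chatty
    responses still parse cleanly.
    """
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# Example: a fenced response using a hypothetical action schema
raw = 'Sure! ```json\n{"action": "click", "selector": "#submit"}\n``` Done.'
action = parse_action(raw)
print(action["action"])  # click
```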

Download

# Using the Agent Nemo download script:
./scripts/download_model.sh

# Or manually:
huggingface-cli download yashsikdar/agent-nemo-qwen2vl-gguf --local-dir ~/.agent-nemo/models/