Agent Action Router (Qwen2.5 0.5B)
This is a full-parameter supervised fine-tune of Qwen/Qwen2.5-0.5B-Instruct for one narrow, schema-bound developer-agent subroutine:
Choose the next harness action from a noisy progress report.
The model is one cell from the Parameter Floors for Developer-Agent Subroutines experiment. Labels are generated by deterministic oracles over real Python repositories; no teacher model or human judge labels the data.
Intended Use
Use this checkpoint inside the repository's verified subroutine harness, which renders the task-specific prompt, parses strict JSON, permits one localized schema-feedback retry, applies deterministic guards, and falls back to rules where appropriate. This is not a general coding assistant or chat model.
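The control flow described above (strict JSON parsing, one schema-feedback retry, rule fallback) can be sketched as follows. This is a minimal illustration, not the repository's harness: the action names in `ALLOWED_ACTIONS`, the prompt wording, and the `generate` and `rules_fallback` callables are all hypothetical placeholders, and the deterministic guards are omitted.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical closed schema; the real action names are defined in the experiment repository.
ALLOWED_ACTIONS = {"continue", "retry", "escalate", "stop"}


def parse_action(raw: str) -> Tuple[Optional[dict], Optional[str]]:
    """Strictly parse model output. Returns (action, None) or (None, error message)."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(obj, dict) or set(obj) != {"action"}:
        return None, 'expected an object with exactly one key, "action"'
    if not isinstance(obj["action"], str) or obj["action"] not in ALLOWED_ACTIONS:
        return None, f"unknown action: {obj['action']!r}"
    return obj, None


def route(
    report: str,
    generate: Callable[[str], str],
    rules_fallback: Callable[[str], dict],
) -> dict:
    """Ask the model for an action, retry once with schema feedback, then fall back to rules."""
    prompt = f"Progress report:\n{report}\nReply with a JSON object."  # hypothetical prompt
    action, error = parse_action(generate(prompt))
    if action is None:
        # Exactly one retry, with the parse error fed back to the model.
        retry_prompt = f"{prompt}\nYour previous reply was rejected: {error}"
        action, error = parse_action(generate(retry_prompt))
    if action is None:
        # Model output is still unusable: defer to deterministic rules.
        return rules_fallback(report)
    return action
```

Keeping `generate` as a plain callable means the same loop can wrap a local Transformers pipeline or a served endpoint, and it can be tested with canned replies.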
Evaluation
Evaluation uses up to 250 examples from HTTPX and Jinja2, both held out entirely from training. Decoding is greedy.
| Metric | Result |
|---|---|
| Success after one schema retry | 96.0% |
| First-pass success | 96.0% |
| First-pass schema validity | 100.0% |
| Base instruct success after retry | 27.6% |
| Rules-only success | 27.7% |
Experiment verdict for this subroutine: works at 494M parameters.
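To make the relationship between the three model metrics in the table concrete, here is a small sketch of how they could be aggregated from per-example results. The `EvalRecord` fields and the `summarize` helper are hypothetical names chosen for illustration; the experiment's own evaluation code may be structured differently. Note that if the retry is triggered only by schema failures, a first-pass schema validity of 100% means it never fires, which would be consistent with the two success figures being identical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EvalRecord:
    """Hypothetical per-example result."""
    first_pass_valid: bool    # first output parsed against the schema
    first_pass_correct: bool  # first output matched the oracle label
    final_correct: bool       # output after the optional retry matched the oracle label


def summarize(records: List[EvalRecord]) -> Dict[str, float]:
    """Return each metric as a fraction of all evaluated examples."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")
    return {
        "first_pass_schema_validity": sum(r.first_pass_valid for r in records) / n,
        "first_pass_success": sum(r.first_pass_correct for r in records) / n,
        "success_after_retry": sum(r.final_correct for r in records) / n,
    }
```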
Training
- Training examples: 2000
- Epochs: 3.0
- Learning rate: 2e-05
- Effective batch configuration: 16 per device x 2 gradient accumulation steps (32 examples per optimizer step on one GPU)
- Maximum sequence length: 2048
- Seed: 0
- Final training loss: 0.177046
- Reproduction hardware: one NVIDIA A100 80GB PCIe
- Source revision: d0fd7bf
The dataset was generated from pinned Flask, Click, and Rich repositories for training/validation. HTTPX and Jinja2 were reserved for testing.
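As a rough check on the scale of this run, the hyperparameters listed above imply the following optimizer step count. This is a back-of-the-envelope sketch that assumes a single GPU (matching the reproduction hardware), that all 2000 examples are seen each epoch, and that a final partial batch counts as a step; the exact count depends on how the trainer handles remainders.

```python
import math

train_examples = 2000
epochs = 3
per_device_batch = 16
grad_accum_steps = 2
num_devices = 1  # assumption: one A100, as in the reproduction hardware

# Examples consumed per optimizer update.
effective_batch = per_device_batch * grad_accum_steps * num_devices  # 32

# Optimizer updates per epoch, counting a final partial batch as one step.
steps_per_epoch = math.ceil(train_examples / effective_batch)  # 63

total_steps = steps_per_epoch * epochs  # 189

print(f"effective batch: {effective_batch}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
```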
Limitations
The checkpoint is specialized to one closed JSON schema and should not be expected to retain broad instruction-following ability. The experiment mixes two base-model families across its size sweep. Some subroutines are better served by deterministic rules; consult the verdict above before deployment.
License
Apache-2.0, following the base model. Experiment code is MIT licensed.