### A3-Qwen3.5-9B

This agent is the [GenericAgent](https://github.com/ServiceNow/AgentLab/blob/main/src/agentlab/agents/generic_agent/generic_agent.py) from [AgentLab](https://github.com/ServiceNow/AgentLab), backed by a model fine-tuned using the Agent-as-Annotators (A3) pipeline.

- **Model Name:** A3-Qwen3.5-9B
- **Base Model:** Qwen/Qwen3.5-9B
- **Model Architecture:**
  - Type: Vision-Language Model (VLM)
  - Architecture: Causal LM with vision encoder
  - Number of Parameters: 9B
- **Input/Output Format:**
  - Input: Accessibility tree + Set-of-Mark (SoM) screenshot
  - Output: Text action in BrowserGym format
  - Flags:
    ```python
    GenericPromptFlags(
        obs=ObsFlags(
            use_html=False,
            use_ax_tree=True,
            use_tabs=True,
            use_focused_element=True,
            use_error_logs=True,
            use_history=True,
            use_past_error_logs=False,
            use_action_history=True,
            use_think_history=False,
            use_diff=False,
            html_type='pruned_html',
            use_screenshot=True,
            use_som=True,
            extract_visible_tag=True,
            extract_clickable_tag=True,
            extract_coords='False',
            filter_visible_elements_only=False,
        ),
        action=ActionFlags(
            action_set=HighLevelActionSetArgs(
                subsets=('webarena',),
                multiaction=False,
                strict=False,
                retry_with_force=True,
                demo_mode='off',
            ),
            long_description=False,
            individual_examples=False,
        ),
        use_plan=False,
        use_criticise=False,
        use_thinking=True,
        use_memory=False,
        use_concrete_example=True,
        use_abstract_example=True,
        use_hints=True,
        enable_chat=False,
        max_prompt_tokens=57344,
        be_cautious=True,
        extra_instructions=None,
    )
    ```
- **Training Details:**
  - Dataset: WebSynth trajectories collected via the A3 pipeline (agent-generated annotations on real websites)
  - Fine-tuning method: Supervised Fine-Tuning (SFT) with FSDP
  - Temperature at inference: 0.6
- **Paper Link:** Forthcoming (COLM 2026 submission)
- **Code Repository:** https://github.com/McGill-NLP/llm-annotators
- **License:** Apache-2.0
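
As an illustration of the output format listed above, BrowserGym high-level actions are written as Python-style function calls such as `click('a51')`, and with `multiaction=False` the agent emits one action per step. The following is a minimal sketch of how such a string could be inspected. The helper `parse_action` is hypothetical and is not part of BrowserGym or AgentLab, and it assumes the action string has already been isolated from any surrounding reasoning text.

```python
import ast


def parse_action(action_text: str) -> tuple[str, list, dict]:
    """Split one function-call-style action string into (name, args, kwargs).

    Hypothetical helper for illustration only; BrowserGym ships its own
    action parsing, which this sketch does not reproduce.
    """
    try:
        # mode="eval" accepts exactly one expression, so two actions on
        # separate lines are rejected (mirrors multiaction=False).
        call = ast.parse(action_text.strip(), mode="eval").body
    except SyntaxError as exc:
        raise ValueError(f"not a single expression: {action_text!r}") from exc

    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"not a plain function call: {action_text!r}")

    # literal_eval only accepts literals, so no code is executed here.
    args = [ast.literal_eval(arg) for arg in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return call.func.id, args, kwargs


if __name__ == "__main__":
    print(parse_action("click('a51')"))
    print(parse_action("fill('b12', 'hello world')"))
```

Using `ast` with `literal_eval` rather than `eval` keeps the sketch safe to run on untrusted model output, since only literal arguments are accepted.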