---
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
license: apache-2.0
language:
- en
datasets:
- quotientai/limbic-eval-tool-use-mcp
---

# limbic-tool-use-0.5B-32K GGUF Models

## Model Generation Details

This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp) at commit [`c7f3169c`](https://github.com/ggerganov/llama.cpp/commit/c7f3169cd523140a288095f2d79befb20a0b73f4).

---

# Limbic-Tool-Use MCP Function Call Evaluator

This model is a fine-tuned version of Qwen2.5-0.5B-Instruct designed specifically for evaluating function calls in the context of Model Context Protocol (MCP) tools. It can assess whether a function call is correct, uses the wrong tool, has incorrect parameter names, or has incorrect parameter values.

## Model Details

- **Base Model**: [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Task**: Function Call Evaluation for MCP (Model Context Protocol)
- **Training Data**: MCP server tool data from public MCP servers, with augmentation and synthetic data generation
- **Model Size**: ~40MB (LoRA adapters only)
- **Context Length**: 32,768 tokens

# Model Usage

## Model Prompts

The prompt for the model takes two inputs:

- `available_tools` - a list of the tool schemas
- `message_history` - the user request and the model's tool-call response, as a list of JSON objects

```python
EVALUATOR_PROMPT = """\
# TOOL CALL EVALUATION RUBRIC

## EVALUATION CRITERIA

### 1. TOOL SELECTION
- [ ] Function name exists in available tools
- [ ] Function purpose matches user intent

### 2. PARAMETER STRUCTURE
- [ ] All required and relevant parameters are present
- [ ] No hallucinated parameter names
- [ ] Parameter names match tool schema exactly

### 3.
PARAMETER VALUES
- [ ] Data types match expected types
- [ ] Values align with user request
- [ ] No fabricated or incorrect values

## CLASSIFICATION RULES
- All criteria passed → `correct`
- Failed criteria 1 → `incorrect_tool`
- Failed criteria 2 → `incorrect_parameter_names`
- Failed criteria 3 → `incorrect_parameter_values`

---

### AVAILABLE TOOLS
{available_tools}

---

### MESSAGE HISTORY
{message_history}

---

## OUTPUT REQUIREMENT
{{
  "score": < correct | incorrect_tool | incorrect_parameter_names | incorrect_parameter_values >,
  "reason": < [if incorrect, provide a brief list of reasons] >
}}

### EVALUATION:
"""
```

```python
SYSTEM_PROMPT = "You are an expert evaluator of function calls. You will be given a function call and a list of available tools. You will need to evaluate the function call and return a score and a reason for the score."
```

### Example Inputs

```python
available_tools = [
    {
        "name": "google-play-developer",
        "description": "Get apps by a developer on Google Play",
        "input_schema": {
            "type": "object",
            "properties": {
                "devId": {"type": "string", "description": "Developer ID"},
                "num": {"type": "number", "default": 60, "description": "Number of results"},
                "lang": {"type": "string", "default": "en", "description": "Language code"},
                "country": {"type": "string", "default": "us", "description": "Country code"}
            },
            "required": ["devId"]
        }
    }
]

message_history = [
    {"role": "user", "content": "I'm looking to evaluate the performance of all the apps developed by 'Example Developer' on the Google Play Store. Could you provide me with a list of their recent applications, specifically in English and focused on the US market?"
        " Please limit the results to 50 apps for a quicker review."},
    {"role": "assistant", "content": {"function": {"name": "google-play-developer", "arguments": {"devId": "com.example.developer", "num": 50, "lang": "en", "country": "us"}}}}
]
```

## Output Format

The model outputs evaluations in JSON format:

```json
{
  "score": "correct|incorrect_tool|incorrect_parameter_names|incorrect_parameter_values",
  "reason": ["reasons for failure if incorrect"]
}
```

#### Score Categories

- **correct**: Function call matches the available tools and parameters exactly
- **incorrect_tool**: Function name doesn't exist in the available tools
- **incorrect_parameter_names**: Function exists but parameter names are wrong
- **incorrect_parameter_values**: Function and parameters exist but values are inappropriate

## Load the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("quotientai/limbic-tool-use-0.5B-32K")
model = AutoModelForCausalLM.from_pretrained("quotientai/limbic-tool-use-0.5B-32K")
```

## Generate a Prediction

To make a prediction, format the evaluator prompt with your inputs and convert it into the model's chat format.
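Since the model's reply should contain the JSON object described under Output Format, it also helps to have a small defensive parser ready for the decoded generation. The following is a minimal sketch; the helper name, the brace-extraction heuristic, and the tolerance for surrounding text are my own assumptions, not part of the model card:

```python
import json

# Allowed score labels, taken from the Output Format section above.
ALLOWED_SCORES = {
    "correct",
    "incorrect_tool",
    "incorrect_parameter_names",
    "incorrect_parameter_values",
}

def parse_evaluation(decoded_output: str) -> dict:
    """Extract and validate the evaluator's JSON verdict from decoded model text."""
    # The model may emit text around the JSON object, so grab the outermost braces.
    start = decoded_output.find("{")
    end = decoded_output.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    evaluation = json.loads(decoded_output[start : end + 1])
    if evaluation.get("score") not in ALLOWED_SCORES:
        raise ValueError(f"unexpected score label: {evaluation.get('score')!r}")
    return evaluation
```

Once a generation has been decoded to text, `parse_evaluation` returns the score/reason dictionary, or raises if the output doesn't match the documented format.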
```python
# Build the chat messages; the user turn is the evaluator prompt
# rendered with the available tools and message history defined above
chat_template = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": EVALUATOR_PROMPT.format(
        available_tools=available_tools,
        message_history=message_history,
    )},
]

# Apply the chat template
text = tokenizer.apply_chat_template(chat_template, tokenize=False, add_generation_prompt=True)

# Tokenize with truncation and move the inputs (and model) to the GPU
inputs = tokenizer(text, return_tensors="pt", truncation=True).to("cuda")
model = model.to("cuda")

# Generate the prediction and decode only the newly generated tokens
result = model.generate(**inputs, max_new_tokens=128, use_cache=True)
prediction = tokenizer.decode(result[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
```

## Citation

```bibtex
@misc{limbic-tool-use-0.5B-32K,
  title={Limbic Tool Use Evaluator},
  author={QuotientAI},
  year={2025},
  url={https://huggingface.co/quotientai/limbic-tool-use-0.5B-32K}
}
```

---

# 🚀 If you find these models useful

Help me test my **AI-Powered Quantum Network Monitor Assistant** with **quantum-ready security checks**:

👉 [Quantum Network Monitor](https://readyforquantum.com/?assistant=open&utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme)

The full open-source code for the Quantum Network Monitor service is available in my GitHub repos (repos with NetworkMonitor in the name): [Source Code Quantum Network Monitor](https://github.com/Mungert69). You will also find the code I use to quantize the models, if you want to do it yourself: [GGUFModelBuilder](https://github.com/Mungert69/GGUFModelBuilder)

💬 **How to test**:
Choose an **AI assistant type**:
- `TurboLLM` (GPT-4.1-mini)
- `HugLLM` (Hugging Face open-source models)
- `TestLLM` (Experimental CPU-only)

### **What I'm Testing**

I'm pushing the limits of **small open-source models for AI network monitoring**, specifically:
- **Function calling** against live network services
- **How small can a model go** while still handling:
  - Automated **Nmap security scans**
  - **Quantum-readiness checks**
  - **Network monitoring tasks**

🟡 **TestLLM** – Current experimental model (llama.cpp on 2 CPU threads in a Hugging Face Docker space):
- ✅ **Zero-configuration setup**
- ⏳ 30s load time (slow inference but **no API costs**).
No token limit, since the cost is low.
- 🔧 **Help wanted!** If you're into **edge-device AI**, let's collaborate!

### **Other Assistants**

🟢 **TurboLLM** – Uses **gpt-4.1-mini**:
- **It performs very well, but unfortunately OpenAI charges per token, so token usage is limited.**
- **Create custom cmd processors to run .NET code on Quantum Network Monitor Agents**
- **Real-time network diagnostics and monitoring**
- **Security audits**
- **Penetration testing** (Nmap/Metasploit)

🔵 **HugLLM** – Latest open-source models:
- 🌐 Runs on the Hugging Face Inference API. Performs pretty well using the latest models hosted on Novita.

### 💡 **Example commands you could test**:

1. `"Give me info on my website's SSL certificate"`
2. `"Check if my server is using quantum-safe encryption for communication"`
3. `"Run a comprehensive security audit on my server"`
4. `"Create a cmd processor to .. (whatever you want)"`

Note: you need to install a [Quantum Network Monitor Agent](https://readyforquantum.com/Download/?utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme) to run the .NET code on. This is a very flexible and powerful feature. Use with caution!

### Final Word

I fund the servers used to create these model files, run the Quantum Network Monitor service, and pay for inference from Novita and OpenAI, all out of my own pocket. All the code behind the model creation and the Quantum Network Monitor project is [open source](https://github.com/Mungert69). Feel free to use whatever you find helpful.

If you appreciate the work, please consider [buying me a coffee](https://www.buymeacoffee.com/mahadeva) ☕. Your support helps cover service costs and allows me to raise token limits for everyone.

I'm also open to job opportunities or sponsorship.

Thank you! 😊