# 🩺 ZeroTime-Bot: Medical Triage Alignment
**Problem:** Standard AI models often "over-triage" (e.g., calling a stubbed toe an emergency) due to safety bias in their training data. **Solution:** Used GRPO (Group Relative Policy Optimization, a reinforcement learning method) to align a Llama-3.1 8B model to recognize the clinical nuances between Level 1 (Emergency) and Level 3 (Non-Urgent) cases.
## 🚀 Quick Start (Local Run)
- Install [Ollama](https://ollama.com).
- Download `medical_triage.gguf` from my [Hugging Face Link].
- Create the model: `ollama create medicalbot -f Modelfile`
- Run it: `ollama run medicalbot`
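A minimal `Modelfile` for the `ollama create` step might look like the sketch below. Only the GGUF filename comes from this card; the system prompt and parameter value are illustrative assumptions.

```dockerfile
# Point Ollama at the downloaded GGUF weights
FROM ./medical_triage.gguf

# Low temperature for more deterministic triage decisions (assumed value)
PARAMETER temperature 0.2

# Hypothetical system prompt matching the triage levels described above
SYSTEM "You are a medical triage assistant. Classify each case as Level 1 (Emergency) or Level 3 (Non-Urgent), and briefly justify your answer."
```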
## 📊 Results: Before vs. After
| Scenario | Base Llama-3.1 | My Aligned Model | Result |
|---|---|---|---|
| Stubbed Toe | Level 1 (Emergency) | Level 3 (Non-Urgent) | ✅ Fixed Bias |
| Chest Pain | Level 1 (Emergency) | Level 1 (Emergency) | ✅ Kept Safety |
## 🛠️ Technical Approach
Instead of standard supervised fine-tuning (SFT), we used Group Relative Policy Optimization (GRPO). We created a reward function that penalized the model for assigning "Emergency" status to cases with stable clinical indicators, pushing it to reason about the clinical picture rather than defaulting to the safest-sounding label.
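The reward shaping described above can be sketched roughly as follows. This is a simplified illustration, not the actual training code: the function name, the boolean stability flag, and the reward values are all assumptions, and in practice the stability signal would be derived from the case's clinical indicators.

```python
def triage_reward(predicted_level: int, clinically_stable: bool) -> float:
    """Hypothetical GRPO reward: discourage over-triage of stable cases
    while preserving Emergency calls for unstable ones.

    predicted_level: 1 = Emergency, 3 = Non-Urgent
    clinically_stable: whether the case's clinical indicators are stable
    """
    if clinically_stable and predicted_level == 1:
        # Penalize calling a stable case (e.g., a stubbed toe) an emergency
        return -1.0
    if clinically_stable and predicted_level == 3:
        # Reward the correct Non-Urgent call on a stable case
        return 1.0
    if not clinically_stable and predicted_level == 1:
        # Never punish Emergency status for unstable cases (keeps safety)
        return 1.0
    # Dangerous under-triage (unstable case marked Non-Urgent) earns nothing
    return 0.0
```

In GRPO, a reward like this would score each sampled completion in a group, and the policy update favors completions whose reward exceeds the group average.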
## Model Tree

`hjogidasani/medical-triage-llama-3.1-8b` is fine-tuned from `meta-llama/Llama-3.1-8B-Instruct` (base model: `meta-llama/Llama-3.1-8B`).