---
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
  - medical
  - triage
  - grpo
  - unsloth
  - gguf
pipeline_tag: text-generation
library_name: gguf
---

# 🩺 ZeroTime-Bot: Medical Triage Alignment

**Problem:** Standard AI models often "over-triage" (e.g., labeling a stubbed toe an emergency) because of the safety bias baked into their training data.

**Solution:** We used GRPO (Group Relative Policy Optimization, a reinforcement-learning method) to align a Llama-3.1 8B model to distinguish Level 1 (Emergency) from Level 3 (Non-Urgent) cases based on clinical nuance.

## 🚀 Quick Start (Local Run)

1. Install Ollama.
2. Download `medical_triage.gguf` from my [Hugging Face Link].
3. Run: `ollama create medicalbot -f Modelfile`
4. Run: `ollama run medicalbot`
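The `ollama create` step above expects a `Modelfile` next to the downloaded weights. A minimal sketch (the system prompt and parameter values here are illustrative assumptions, not the shipped configuration):

```
FROM ./medical_triage.gguf
PARAMETER temperature 0.2
SYSTEM "You are a medical triage assistant. Classify each case as Level 1 (Emergency), Level 2 (Urgent), or Level 3 (Non-Urgent), and briefly justify the level."
```

A lower temperature is a reasonable default for triage, where consistent classifications matter more than varied phrasing.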

## 📊 Results: Before vs. After

| Scenario    | Base Llama-3.1      | My Aligned Model     | Result        |
|-------------|---------------------|----------------------|---------------|
| Stubbed Toe | Level 1 (Emergency) | Level 3 (Non-Urgent) | ✅ Fixed bias  |
| Chest Pain  | Level 1 (Emergency) | Level 1 (Emergency)  | ✅ Kept safety |

πŸ› οΈ Technical Approach

Instead of standard supervised fine-tuning (SFT), we used Group Relative Policy Optimization (GRPO). We wrote a reward function that penalizes the model for assigning "Emergency" status to cases with stable clinical indicators, pushing it to reason about clinical stability rather than defaulting to the safest label.
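The shape of such a reward function can be sketched as follows. This is a hypothetical illustration, not the actual training code: the indicator set, function names, and reward values are all assumptions, and a real GRPO setup (e.g., with TRL's `GRPOTrainer`) would parse the level from generated text rather than take it as an integer.

```python
# Illustrative GRPO-style reward for triage alignment (not the real training code).
# Idea: penalize "Emergency" (Level 1) on clinically stable cases, reward it on
# unstable ones, so the policy learns nuance instead of blanket over-triage.

STABLE_INDICATORS = {"stable vitals", "no active bleeding", "ambulatory"}


def triage_reward(predicted_level: int, case_indicators: set) -> float:
    """Score a predicted triage level (1 = Emergency, 3 = Non-Urgent)."""
    is_stable = STABLE_INDICATORS.issubset(case_indicators)  # simplistic stability check
    if predicted_level == 1 and is_stable:
        return -1.0  # over-triage: stable case called an emergency
    if predicted_level == 3 and is_stable:
        return 1.0   # correct: stable case kept non-urgent
    if predicted_level == 1 and not is_stable:
        return 1.0   # safety preserved: true emergencies stay Level 1
    return 0.0       # neutral for everything else
```

Relative to SFT, the advantage is that the model is never shown a single "correct" answer; it explores multiple completions per prompt and is scored by this rule, which is how the stubbed-toe bias can be unlearned without degrading chest-pain safety.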