---
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
  - medical
  - triage
  - grpo
  - unsloth
  - gguf
pipeline_tag: text-generation
library_name: gguf
---

# 🩺 ZeroTime-Bot: Medical Triage Alignment

**Problem:** Standard AI models often "over-triage" (e.g., labeling a stubbed toe an emergency) because of the safety bias baked into their training data.

**Solution:** We used GRPO (Group Relative Policy Optimization, a reinforcement-learning method) to align a Llama-3.1 8B model to distinguish Level 1 (Emergency) from Level 3 (Non-Urgent) cases based on clinical nuance.

## 🚀 Quick Start (Local Run)

1. Install Ollama.
2. Download `medical_triage.gguf` from my [Hugging Face Link].
3. Run: `ollama create medicalbot -f Modelfile`
4. Run: `ollama run medicalbot`
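The `ollama create` step above expects a `Modelfile` next to the downloaded weights. A minimal sketch (the system prompt and parameter values here are illustrative assumptions, not the shipped configuration):

```
FROM ./medical_triage.gguf
PARAMETER temperature 0.2
SYSTEM "You are a medical triage assistant. Classify each case as Level 1 (Emergency), Level 2 (Urgent), or Level 3 (Non-Urgent), and briefly justify the level."
```

A lower temperature is a reasonable default for triage, where consistent classifications matter more than varied phrasing.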

## 📊 Results: Before vs. After

| Scenario    | Base Llama-3.1      | My Aligned Model     | Result        |
|-------------|---------------------|----------------------|---------------|
| Stubbed Toe | Level 1 (Emergency) | Level 3 (Non-Urgent) | ✅ Fixed bias  |
| Chest Pain  | Level 1 (Emergency) | Level 1 (Emergency)  | ✅ Kept safety |

πŸ› οΈ Technical Approach

Instead of standard supervised fine-tuning (SFT), we used Group Relative Policy Optimization (GRPO). We wrote a reward function that penalizes the model for assigning "Emergency" status to cases with stable clinical indicators, pushing it to reason about clinical stability rather than defaulting to the safest label.
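The shape of such a reward function can be sketched as follows. This is a hypothetical illustration, not the actual training code: the indicator set, function names, and reward values are all assumptions, and a real GRPO setup (e.g., with TRL's `GRPOTrainer`) would parse the level from generated text rather than take it as an integer.

```python
# Illustrative GRPO-style reward for triage alignment (not the real training code).
# Idea: penalize "Emergency" (Level 1) on clinically stable cases, reward it on
# unstable ones, so the policy learns nuance instead of blanket over-triage.

STABLE_INDICATORS = {"stable vitals", "no active bleeding", "ambulatory"}


def triage_reward(predicted_level: int, case_indicators: set) -> float:
    """Score a predicted triage level (1 = Emergency, 3 = Non-Urgent)."""
    is_stable = STABLE_INDICATORS.issubset(case_indicators)  # simplistic stability check
    if predicted_level == 1 and is_stable:
        return -1.0  # over-triage: stable case called an emergency
    if predicted_level == 3 and is_stable:
        return 1.0   # correct: stable case kept non-urgent
    if predicted_level == 1 and not is_stable:
        return 1.0   # safety preserved: true emergencies stay Level 1
    return 0.0       # neutral for everything else
```

Relative to SFT, the advantage is that the model is never shown a single "correct" answer; it explores multiple completions per prompt and is scored by this rule, which is how the stubbed-toe bias can be unlearned without degrading chest-pain safety.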