iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter

@@ -1,100 +1,112 @@
----
-datasets:
-- iimran/Medical-Intelligence-Questions
-base_model:
-- Qwen/Qwen2.5-3B
-language:
-- en
-tags:
-- medical
-- text-generation-inference
-- transformers
-- unsloth
----
-# Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter
-This repository contains the LoRA adapter weights and configuration for **Qwen2.5-3B-R1-MedicalReasoner**, a state-of-the-art clinical reasoning language model fine-tuned using GRPO. The adapter is designed to further optimize and customize model behavior for clinical reasoning tasks.
-## Overview
-- **Adapter Name:** Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter
-- **Purpose:** To modify and enhance the base model (Qwen2.5-3B-R1-MedicalReasoner) using Low-Rank Adaptation (LoRA) techniques without modifying the full model weights.
-- **Use Case:** Ideal for users wishing to fine-tune, experiment, or deploy the clinical reasoning model with customized parameter-efficient adaptations.
-## Key Features
-- **Parameter-Efficient Adaptation:** LoRA allows for training a small number of additional parameters, making further fine-tuning efficient in time and resources.
-- **Seamless Integration:** Easily integrated with the base model using the provided tools and functions in Unsloth and vLLM.
-- **Optimized for Clinical Reasoning:** The adapter reinforces chain-of-thought generation and improves the clarity of diagnostic reasoning outputs.
-## How to Use
-### Integration with Base Model
-To download and load the LoRA adapter into Qwen2.5-3B-R1-MedicalReasoner:
-```python
-from huggingface_hub import snapshot_download
-from unsloth import FastLanguageModel
-# Download the adapter weights:
-lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
-print("LoRA adapter downloaded to:", lora_path)
-# Load base model:
-model, tokenizer = FastLanguageModel.from_pretrained(
-    model_name="iimran/Qwen2.5-3B-R1-MedicalReasoner",
-    load_in_4bit=False,
-    fast_inference=True
-)
-# Load the LoRA adapter:
-model.load_lora(lora_path)
-```
-## Fine-Tuning and Experimentation
-This adapter was originally developed and fine-tuned using GRPO with customized reward functions to enhance chain-of-thought reasoning. Researchers who wish to further optimize the behavior of the clinical reasoning model with targeted adaptations can start from these adapter weights.
-## Installation Requirements
-* **Python Version:** 3.8 or higher
-* **Dependencies:**
-   * unsloth
-   * vLLM
-   * huggingface-hub
-   * Other dependencies required by the base model and LoRA integration
-Install the required packages using pip:
-```bash
-pip install unsloth vllm huggingface-hub
-```
-## Citation
-If you use the LoRA adapter in your work, please cite:
-```bibtex
-@misc{Qwen2.5-3B-R1-MedicalReasoner-lora-adapter,
-  authors = {Imran Sarwar, Muhammad Rouf Mustafa},
-  title = {Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter},
-  year = {2025},
-  publisher = {Hugging Face},
-  url = {https://huggingface.co/iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter}
-}
-```
-## Contributing
-Contributions to the LoRA adapter are welcome. If you have improvements for:
-* Adapter performance or efficiency
-* Documentation updates
-* Additional experiments or fine-tuning strategies
-Please open an issue or submit a pull request.
-## Disclaimer
 This LoRA adapter is provided for research and educational purposes. It is intended to be used in combination with the **Qwen2.5-3B-R1-MedicalReasoner** base model. As with the base model, clinical outputs should be validated by qualified healthcare professionals before being used in any medical decision-making.

+---
+datasets:
+- iimran/Medical-Intelligence-Questions
+base_model:
+- Qwen/Qwen2.5-3B
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+tags:
+- medical
+- text-generation-inference
+- transformers
+- unsloth
+---
+# Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter
+This repository contains the LoRA adapter weights and configuration for **Qwen2.5-3B-R1-MedicalReasoner**, a state-of-the-art clinical reasoning language model fine-tuned using GRPO (Group Relative Policy Optimization). The adapter is designed to further tune and customize the model's behavior on clinical reasoning tasks.
+## Overview
+- **Adapter Name:** Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter
+- **Purpose:** To adapt the base model (Qwen2.5-3B-R1-MedicalReasoner) with Low-Rank Adaptation (LoRA), which trains small low-rank update matrices and leaves the full model weights unchanged.
+- **Use Case:** Ideal for users wishing to fine-tune, experiment, or deploy the clinical reasoning model with customized parameter-efficient adaptations.
+## Key Features
+- **Parameter-Efficient Adaptation:** LoRA allows for training a small number of additional parameters, making further fine-tuning efficient in time and resources.
+- **Seamless Integration:** Easily integrated with the base model using the provided tools and functions in Unsloth and vLLM.
+- **Optimized for Clinical Reasoning:** The adapter reinforces chain-of-thought generation and improves the clarity of diagnostic reasoning outputs.
+## How to Use
+### Integration with Base Model
+To download and load the LoRA adapter into Qwen2.5-3B-R1-MedicalReasoner:
+```python
+from huggingface_hub import snapshot_download
+from unsloth import FastLanguageModel
+# Download the adapter weights:
+lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
+print("LoRA adapter downloaded to:", lora_path)
+# Load base model:
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name="iimran/Qwen2.5-3B-R1-MedicalReasoner",
+    load_in_4bit=False,
+    fast_inference=True
+)
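+# Load the LoRA adapter. With fast_inference=True, load_lora returns a LoRA
+# request object; keep it and pass it as lora_request= to model.fast_generate:
+lora_request = model.load_lora(lora_path)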
+```
+## Fine-Tuning and Experimentation
+This adapter was originally developed and fine-tuned using GRPO with customized reward functions to enhance chain-of-thought reasoning. Researchers who wish to further optimize the behavior of the clinical reasoning model with targeted adaptations can start from these adapter weights.
+## Installation Requirements
+* **Python Version:** 3.8 or higher
+* **Dependencies:**
+   * unsloth
+   * vLLM
+   * huggingface-hub
+   * Other dependencies required by the base model and LoRA integration
+Install the required packages using pip:
+```bash
+pip install unsloth vllm huggingface-hub
+```
+## Citation
+If you use the LoRA adapter in your work, please cite:
+```bibtex
+@misc{Qwen2.5-3B-R1-MedicalReasoner-lora-adapter,
+  author = {Imran Sarwar and Muhammad Rouf Mustafa},
+  title = {Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter},
+  year = {2025},
+  publisher = {Hugging Face},
+  url = {https://huggingface.co/iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter}
+}
+```
+## Contributing
+Contributions to the LoRA adapter are welcome. If you have improvements in any of the following areas, please open an issue or submit a pull request:
+* Adapter performance or efficiency
+* Documentation updates
+* Additional experiments or fine-tuning strategies
+## Disclaimer
 This LoRA adapter is provided for research and educational purposes. It is intended to be used in combination with the **Qwen2.5-3B-R1-MedicalReasoner** base model. As with the base model, clinical outputs should be validated by qualified healthcare professionals before being used in any medical decision-making.
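
The parameter-efficiency point under Key Features can be made concrete with a little arithmetic. The sketch below is an illustration only: the helper `lora_param_counts`, the 2048 x 2048 layer shape, and the rank of 16 are assumed values chosen for the example, not this adapter's actual configuration, which is defined by the configuration files shipped in the repository.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one linear layer.

    LoRA freezes the d_out x d_in weight W and learns two small factors,
    B (d_out x rank) and A (rank x d_in), so the adapted weight is W + B @ A.
    """
    full = d_out * d_in           # parameters updated by full fine-tuning
    lora = rank * (d_out + d_in)  # parameters in B plus parameters in A
    return full, lora


# Hypothetical layer shape and rank, chosen only for illustration.
full, lora = lora_param_counts(d_out=2048, d_in=2048, rank=16)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA update:      {lora:,} trainable parameters ({lora / full:.2%} of full)")
```

Because the adapter count grows with `rank * (d_out + d_in)` rather than `d_out * d_in`, the saving is largest for wide layers and small ranks.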