UR OWN GF is a high-fidelity roleplay model fine-tuned from the Ministral-3B-Instruct model.
It has been specifically trained to simulate a Girlfriend (GF) persona, focusing on natural conversation, emotional intelligence, and engaging roleplay interactions.
This model avoids the "robotic" feel of standard assistants and is optimized for casual, affectionate, and dynamic roleplay.
Model Highlights
This model was trained using Unsloth on an NVIDIA H100 (80GB) GPU with maximum quality settings to ensure the model captures the finest nuances of the dataset:
- Base Model: Ministral 3B (State-of-the-art small model, outperforms many 7B models).
- Precision: Native BFloat16 (No quantization during training for maximum intelligence).
- LoRA Rank: 32 (a lower rank, chosen to learn style without memorizing specific entities).
- Context Window: 4096 Tokens.
- Dataset: Jahirrrr/gf-conversation.
Available GGUF Quantizations
This repository contains GGUF files compatible with LM Studio, Ollama, KoboldCPP, and llama.cpp.
| Filename | Quantization | Size | Description |
|---|---|---|---|
| ur-own-gf-f16.gguf | F16 | ~6.5 GB | Best quality. Uncompressed weights. Recommended if you have >8GB VRAM. |
| ur-own-gf-q8_0.gguf | Q8_0 | ~3.5 GB | Near-lossless quality. Very fast processing. Recommended for 6GB+ VRAM. |
| ur-own-gf-q5_k_m.gguf | Q5_K_M | ~2.4 GB | High accuracy, balanced size. |
| ur-own-gf-q4_k_m.gguf | Q4_K_M | ~2.0 GB | Recommended for mobile/low VRAM. Good balance of speed and smarts. |
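As a rough aid for choosing a file, the sketch below picks the largest quantization that fits a given VRAM budget. The file names and sizes come from the table; the `pick_quant` helper and its `headroom_gb` allowance (for the KV cache and runtime buffers) are assumptions made for illustration, not figures from this model card.

```python
from typing import Optional

# (filename, approximate size in GB) from the table above, largest first.
QUANTS = [
    ("ur-own-gf-f16.gguf", 6.5),
    ("ur-own-gf-q8_0.gguf", 3.5),
    ("ur-own-gf-q5_k_m.gguf", 2.4),
    ("ur-own-gf-q4_k_m.gguf", 2.0),
]


def pick_quant(vram_gb: float, headroom_gb: float = 1.5) -> Optional[str]:
    """Return the largest file that fits in vram_gb after reserving headroom_gb.

    headroom_gb is an assumed allowance, not a measured value.
    """
    budget = vram_gb - headroom_gb
    for filename, size_gb in QUANTS:
        if size_gb <= budget:
            return filename
    return None  # nothing fits fully on the GPU


if __name__ == "__main__":
    for vram in (12.0, 6.0, 4.0, 3.0):
        print(vram, "->", pick_quant(vram))
```

Actual memory use also depends on context length and the runtime, so treat the result as a starting point.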
Chat Template (Mistral)
This model uses the standard Mistral chat template.
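For orientation only, here is a simplified sketch of the `[INST] ... [/INST]` layout used by earlier Mistral instruct models. Exact spacing, system-prompt handling, and special tokens differ between Mistral tokenizer versions, so the template bundled with this model's tokenizer (or GGUF file) is authoritative; `render_inst_prompt` is a hypothetical helper and may not match it exactly.

```python
from typing import Dict, List


def render_inst_prompt(messages: List[Dict[str, str]]) -> str:
    """Render user/assistant turns in a simplified [INST] ... [/INST] layout.

    Illustrative only: not guaranteed to match this model's real template.
    """
    parts = ["<s>"]
    for msg in messages:
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        elif msg["role"] == "assistant":
            # A finished assistant turn is closed with the end-of-sequence token.
            parts.append(f" {msg['content']}</s>")
        else:
            raise ValueError(f"unsupported role in this sketch: {msg['role']}")
    return "".join(parts)


if __name__ == "__main__":
    print(render_inst_prompt([{"role": "user", "content": "Hi, babe!"}]))
```

In practice, prefer `tokenizer.apply_chat_template(...)` or the chat mode of your GGUF runtime over building prompt strings by hand.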
How to Use
Option 1: LM Studio / Ollama (Easiest)
- Download the .gguf file of your choice (e.g., q4_k_m or q8_0).
- Load it into LM Studio.
- Set the System Prompt (Optional) to something like: "You are a loving and caring girlfriend."
- Start chatting!
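If you would rather script the chat than use the UI, both LM Studio and Ollama can expose an OpenAI-compatible local server. The sketch below only builds the request body. The `build_chat_payload` helper, the model identifier `"ur-own-gf"`, and the sampling values are assumptions for illustration; use whatever name your app shows for the loaded model.

```python
import json
from typing import Any, Dict


def build_chat_payload(
    user_message: str,
    system_prompt: str = "You are a loving and caring girlfriend.",
    model: str = "ur-own-gf",  # assumed identifier; check your app's model list
) -> Dict[str, Any]:
    """Build a chat-completions request body with an optional system prompt."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_message})
    return {
        "model": model,
        "messages": messages,
        "temperature": 0.7,  # assumed sampling settings
        "max_tokens": 128,
    }


if __name__ == "__main__":
    print(json.dumps(build_chat_payload("Hi, babe!"), indent=2))
```

The payload would be sent as a POST to the server's `/v1/chat/completions` route. The host and port depend on your setup (for example, `http://localhost:1234` is commonly LM Studio's default).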
Option 2: Python (Unsloth/Transformers)
You can run the model (GGUF or LoRA) directly in Python using Unsloth.
```python
from unsloth import FastLanguageModel

# 1. Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Jahirrrr/ur-own-gf",
    max_seq_length = 4096,
    load_in_4bit = True,  # Set False for F16
)
FastLanguageModel.for_inference(model)  # Switch Unsloth to inference mode

# 2. Prepare Message
messages = [
    {"role": "user", "content": "Hi, babe!"},
]

# 3. Apply Chat Template
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# 4. Generate Response
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=128,
    do_sample=True,  # Needed for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)

# 5. Decode (the output contains the prompt followed by the reply)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
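For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` prints the user message as well as the reply. The sketch below shows the slicing idea on plain Python lists. `new_tokens_only` is a hypothetical helper and the token IDs are made up.

```python
from typing import List, Sequence


def new_tokens_only(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Drop the first prompt_length tokens, keeping only the generated reply."""
    return list(output_ids[prompt_length:])


if __name__ == "__main__":
    prompt_ids = [1, 3, 1000, 4]               # made-up prompt token IDs
    output_ids = prompt_ids + [2000, 2001, 2]  # prompt followed by the reply
    print(new_tokens_only(output_ids, len(prompt_ids)))
    # With tensors, the equivalent slice would be:
    #   reply_ids = outputs[0][inputs.shape[-1]:]
    #   print(tokenizer.decode(reply_ids, skip_special_tokens=True))
```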
Training Details
This model was trained using Unsloth on high-performance hardware. The training parameters were tuned to favor style adaptation (learning the "girlfriend" persona) while minimizing overfitting to dataset-specific knowledge.
Infrastructure & Environment
| Parameter | Value | Description |
|---|---|---|
| Hardware | NVIDIA H100 (80GB) | High-end datacenter GPU for maximum speed and precision. |
| Library | Unsloth | Accelerated fine-tuning library (2x faster). |
| Base Model | Ministral-3B-Instruct | State-of-the-art small language model (2512 version). |
| Dataset | Jahirrrr/gf-conversation | Conversational roleplay dataset. |
| Precision | BFloat16 (BF16) | Native 16-bit precision (No 4-bit quantization used during training). |
LoRA Configuration
| Parameter | Value | Notes |
|---|---|---|
| Rank (r) | 32 | Lower rank chosen to learn style without memorizing specific entities. |
| LoRA Alpha | 64 | Scaling factor (Standard 2x Rank). |
| LoRA Dropout | 0.05 | Added to increase model flexibility and prevent overfitting. |
| Target Modules | All Linear Layers | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj. |
| Bias | None | Optimized for efficiency. |
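To make the table concrete, the sketch below records these settings in a plain dataclass and derives two quantities. The first is the standard LoRA scaling factor `alpha / r`. The second is the number of adapter parameters for a single linear layer, `r * (d_in + d_out)`. `LoraSettings` and `lora_param_count` are hypothetical names, not the PEFT or Unsloth API, and the 1024×1024 layer shape is a placeholder rather than the base model's real dimensions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LoraSettings:
    """Values copied from the LoRA configuration table."""
    r: int = 32
    alpha: int = 64
    dropout: float = 0.05
    target_modules: Tuple[str, ...] = (
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    )
    bias: str = "none"

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / r.
        return self.alpha / self.r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Adapter parameters for one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    cfg = LoraSettings()
    print(cfg.scaling)                           # 2.0
    print(lora_param_count(1024, 1024, cfg.r))   # placeholder layer shape
```

Because the adapter size grows linearly with `r`, a lower rank leaves less capacity for memorizing specifics, which matches the rationale given in the table.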
Hyperparameters
| Parameter | Value | Context |
|---|---|---|
| Context Window | 4096 Tokens | Sufficient for medium-length chat history. |
| Learning Rate | 2e-5 | Standard SFT learning rate. |
| Batch Size | 32 | Global batch size (per_device: 16 × accumulation: 2). |
| Max Steps | 60 | Short training duration (~1.5 epochs) to avoid rote memorization of dataset names. |
| Optimizer | AdamW 8-bit | Memory-efficient optimizer. |
| Scheduler | Cosine | Smooth convergence. |
| Weight Decay | 0.01 | Regularization parameter. |
| Training Method | Responses Only | Masking applied on user prompts (Model learns to answer, not to ask). |
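The arithmetic behind the batch size and step count can be checked with a few lines of Python, along with a sketch of a cosine learning-rate curve. Two assumptions are made here: a single GPU, which is consistent with 16 × 2 = 32, and a plain cosine decay from the peak rate to zero with no warmup, since the table does not state warmup steps. The implied dataset size is only as precise as the "~1.5 epochs" figure.

```python
import math

PER_DEVICE_BATCH = 16
GRAD_ACCUM_STEPS = 2
MAX_STEPS = 60
PEAK_LR = 2e-5
APPROX_EPOCHS = 1.5  # "~1.5 epochs" from the table

global_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS    # 32 (single GPU assumed)
samples_seen = global_batch * MAX_STEPS               # 1920
implied_dataset_size = samples_seen / APPROX_EPOCHS   # roughly 1280 examples


def cosine_lr(step: int, total_steps: int = MAX_STEPS, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(global_batch, samples_seen, implied_dataset_size)
    for step in (0, 15, 30, 45, 60):
        print(step, f"{cosine_lr(step):.2e}")
```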
Model trained & quantized by Jahirrrr
Model tree for Jahirrrr/ur-own-gf
Base model
mistralai/Ministral-3-3B-Base-2512