Instructions for using DireDreadlord/TinyStories-GPT-Neo-LoRA with libraries and local apps.

Libraries

How to use DireDreadlord/TinyStories-GPT-Neo-LoRA with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DireDreadlord/TinyStories-GPT-Neo-LoRA")

# Load model directly (this repo holds only an adapter, so the peft package needs to be installed)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("DireDreadlord/TinyStories-GPT-Neo-LoRA", dtype="auto")

vLLM

How to use DireDreadlord/TinyStories-GPT-Neo-LoRA with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DireDreadlord/TinyStories-GPT-Neo-LoRA"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DireDreadlord/TinyStories-GPT-Neo-LoRA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
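
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_completion_request` and `complete` are hypothetical, and `complete` assumes a server is already listening at the given base URL (for example `http://localhost:8000` for vLLM).

```python
import json
import urllib.request

MODEL_ID = "DireDreadlord/TinyStories-GPT-Neo-LoRA"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the first completion's text (needs a running server)."""
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Usage, once a server is running:
#   print(complete("http://localhost:8000", "Once upon a time,"))
```

Keeping request construction separate from sending makes it possible to inspect the payload without a live server.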

Use Docker

docker model run hf.co/DireDreadlord/TinyStories-GPT-Neo-LoRA

SGLang

How to use DireDreadlord/TinyStories-GPT-Neo-LoRA with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DireDreadlord/TinyStories-GPT-Neo-LoRA" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DireDreadlord/TinyStories-GPT-Neo-LoRA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DireDreadlord/TinyStories-GPT-Neo-LoRA" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DireDreadlord/TinyStories-GPT-Neo-LoRA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

GPT-Neo 125M LoRA for TinyStories

This repository contains a LoRA adapter for EleutherAI/gpt-neo-125m, trained on the karpathy/tinystories-gpt4-clean dataset for text generation.

Model Information

Base model: EleutherAI/gpt-neo-125m
Fine-tuned on: karpathy/tinystories-gpt4-clean
Training: 12,000 steps on an RTX 3050 (4 GB VRAM)

Training Configuration

This adapter was trained with the following settings:

LoRA rank: 16
LoRA alpha: 32
Target modules: c_attn, c_proj
Dropout: 0.05
Bias: none
Task type: CAUSAL_LM
Precision: fp16
Gradient checkpointing: enabled
Learning rate: 2e-5
Batch size: 1
Gradient accumulation: 4
Save strategy: epoch
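
As a rough illustration, the settings above map onto keyword arguments of `peft.LoraConfig` and `transformers.TrainingArguments` as sketched below. The original training script is not part of this card, so the dictionaries, the helper `build_lora_config`, and the derived values are assumptions based only on the listed settings.

```python
# Adapter settings copied from the list above; keys follow peft.LoraConfig argument names.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["c_attn", "c_proj"],
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
}

# Trainer settings; keys follow transformers.TrainingArguments argument names.
training_kwargs = {
    "fp16": True,
    "gradient_checkpointing": True,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "save_strategy": "epoch",
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
lora_scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]

# Sequences contributing to each optimizer step, assuming a single GPU.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)


def build_lora_config():
    """Create the PEFT config object (hypothetical helper; requires the peft package)."""
    from peft import LoraConfig  # imported lazily so the values above work without peft

    return LoraConfig(**lora_kwargs)


print(lora_scaling, effective_batch_size)
```

With rank 16 and alpha 32 the update is scaled by 2, and a batch size of 1 with 4 accumulation steps gives an effective batch of 4 sequences per optimizer step.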

How to Use

Load the adapter from Hugging Face Hub

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "EleutherAI/gpt-neo-125m"
adapter_repo = "DireDreadlord/TinyStories-GPT-Neo-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_repo)

Notes

This repository stores only the LoRA adapter. The base model weights are not included here.
To use the adapter, you must load the same base model (EleutherAI/gpt-neo-125m).
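
One way to guard against pairing the adapter with the wrong base model is to read the `base_model_name_or_path` field that PEFT writes into `adapter_config.json`. The sketch below is a minimal check; the function names are hypothetical and the inline JSON is a made-up fragment for illustration, not the actual contents of this repository's config file.

```python
import json

EXPECTED_BASE = "EleutherAI/gpt-neo-125m"


def base_model_from_adapter_config(config_text):
    """Return the base model recorded in an adapter_config.json string, or None."""
    return json.loads(config_text).get("base_model_name_or_path")


def check_base_model(config_text, expected=EXPECTED_BASE):
    """Raise ValueError if the adapter was saved against a different base model."""
    found = base_model_from_adapter_config(config_text)
    if found != expected:
        raise ValueError(f"adapter expects base model {found!r}, not {expected!r}")
    return found


# Illustrative fragment only; a real adapter_config.json has more fields.
sample_config = '{"base_model_name_or_path": "EleutherAI/gpt-neo-125m", "r": 16}'
print(check_base_model(sample_config))
```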

Example Inference

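from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "EleutherAI/gpt-neo-125m"
# Local directory holding the adapter files; the Hub repo ID also works here.
adapter_path = "./gptn125-lora-tinystories"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()  # disable dropout for inference

text = "Once upon a time"
# Keep the attention mask alongside input_ids so generate() does not have to infer it.
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=150,
    do_sample=True,
    top_p=0.95,
    temperature=0.9,
    pad_token_id=tokenizer.eos_token_id,  # the GPT-Neo tokenizer defines no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))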
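
To give a feel for what `temperature` and `top_p` do in the call above, the following sketch applies temperature scaling and nucleus (top-p) filtering to a toy list of logits. It is a simplified illustration in plain Python, not the Transformers implementation, and the function names and the example logits are made up.

```python
import math
import random


def nucleus_distribution(logits, temperature=0.9, top_p=0.95):
    """Return {token_index: probability} after temperature scaling and top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [value / total for value in exps]

    # Keep the most likely tokens until their cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Renormalise over the surviving tokens.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.9, top_p=0.95, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = nucleus_distribution(logits, temperature, top_p)
    indices = list(dist)
    return rng.choices(indices, weights=[dist[i] for i in indices], k=1)[0]


toy_logits = [2.0, 1.0, 0.0, -5.0]
print(nucleus_distribution(toy_logits))
```

Lowering `top_p` or `temperature` narrows the set of candidate tokens, which tends to make stories more repetitive but less erratic.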

Files Included

adapter_model.safetensors — LoRA adapter weights
adapter_config.json — LoRA configuration and metadata
training_args.bin — training arguments saved by Hugging Face Trainer
README.md — model card and usage instructions
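
When working from a local copy of the adapter, a quick check that the two files PEFT needs for loading (the weights and the config) are present can save a confusing error later. This is a minimal sketch; the helper name `missing_adapter_files` is hypothetical, and the directory in the usage comment is the local path used in the inference example.

```python
from pathlib import Path

# Files needed to load the adapter; training_args.bin and README.md are not required.
REQUIRED_FILES = ("adapter_model.safetensors", "adapter_config.json")


def missing_adapter_files(adapter_dir):
    """Return the required adapter files that are absent from adapter_dir."""
    root = Path(adapter_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Usage:
#   missing = missing_adapter_files("./gptn125-lora-tinystories")
#   if missing:
#       raise FileNotFoundError(f"missing adapter files: {missing}")
```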
