Instructions for using Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base")
model = AutoModelForCausalLM.from_pretrained("Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

llama-cpp-python

How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base",
    filename="ggml-model-bf16.gguf",
)

llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)
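
`create_chat_completion` returns an OpenAI-style response dictionary rather than a plain string. The sketch below shows one way to pull the assistant text out of such a dictionary. `extract_reply` is a hypothetical helper name, and `sample_response` is a hand-written stand-in (not real model output) so the snippet runs without downloading the model.

```python
from typing import Any, Dict


def extract_reply(response: Dict[str, Any]) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices", [])
    if not choices:
        raise ValueError("response contains no choices")
    # Each choice carries a message with a role and the generated content.
    return choices[0]["message"]["content"]


# Hand-written stand-in for a response; real model output will differ.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(sample_response))
```

With a loaded model, the same helper could be applied to the return value of `llm.create_chat_completion(...)`.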

Local Apps

llama.cpp

How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16
# Run inference directly in the terminal:
llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16
# Run inference directly in the terminal:
llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16
# Run inference directly in the terminal:
./llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16

Use Docker

docker model run hf.co/Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16

vLLM

How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
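
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; `build_chat_request` and `send_chat_request` are hypothetical helper names, and the default base URL assumes the vLLM server is listening on port 8000 as in the curl example. The request is only built here, not sent, so the snippet runs without a server.

```python
import json
import urllib.request

MODEL_ID = "Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base"


def build_chat_request(
    prompt: str,
    model: str = MODEL_ID,
    base_url: str = "http://localhost:8000",  # assumed server address
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant text from the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


request = build_chat_request("What is the capital of France?")
print(request.full_url)
# Uncomment once the server is running:
# print(send_chat_request(request))
```

Passing `base_url="http://localhost:30000"` would target a server on port 30000 instead, which is the port used in the SGLang examples on this page.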

Use Docker

docker model run hf.co/Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16

SGLang

How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with Ollama:
```
ollama run hf.co/Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16
```

Unsloth Studio

How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base to start chatting

Docker Model Runner
How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with Docker Model Runner:
```
docker model run hf.co/Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16
```

Lemonade

How to use Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:BF16

Run and chat with the model

lemonade run user.Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base-BF16

List all available models

lemonade list

Model Card for Model ID

Dr. Yunsung Ji (Saxo), a director and data scientist at Linkbricks, a company specializing in AI and big data analytics, built this Korean-language model by applying SFT-DPO instruction training to the meta-llama/Meta-Llama-3-8B base model on 8 H100-80G GPUs on GCP for 4 hours (8000 tokens). The Accelerate and DeepSpeed ZeRO-3 libraries were used.

The tokenizer is identical to Llama 3's; this version does not extend the Korean vocabulary. For a model with a Korean-specific tokenizer containing more than 200,000 Korean tokens, please contact us separately.

www.linkbricks.com, www.linkbricks.vc

Configuration including BitsandBytes

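from transformers import BitsAndBytesConfig

# torch_dtype is defined elsewhere in the training script (not shown here).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
)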
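
To give a feel for what 4-bit NF4 loading means for weight memory, here is a rough back-of-the-envelope sketch. It rests on several assumptions that are not stated in the card: a quantization block size of 64 with one 32-bit scaling constant per block (which adds 0.5 bits per weight when double quantization is off), every one of the roughly 8B parameters being quantized (in practice some layers typically stay in higher precision), and no accounting for activations, gradients, or optimizer state.

```python
def weight_memory_gb(num_params: int, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8_000_000_000  # rounded from "8B params"
BLOCK_SIZE = 64             # assumed quantization block size
SCALE_BITS = 32             # assumed one fp32 scaling constant per block

# 4 bits per weight plus the per-block scaling constant spread over the block.
nf4_bits = 4 + SCALE_BITS / BLOCK_SIZE

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)
nf4_gb = weight_memory_gb(NUM_PARAMS, nf4_bits)

print(f"BF16 weights: ~{bf16_gb:.1f} GB")
print(f"NF4 weights:  ~{nf4_gb:.1f} GB ({nf4_bits} bits per parameter)")
```

Under these assumptions the quantized weights take a bit over a quarter of the BF16 footprint; real numbers depend on which layers are actually quantized.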

import torch
from transformers import TrainingArguments

# project_name and run_name_str are defined elsewhere in the training script (not shown here).
args = TrainingArguments(
    output_dir=project_name,
    run_name=run_name_str,
    overwrite_output_dir=True,
    num_train_epochs=20,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,  # 1
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    # optim="adamw_8bit",
    logging_steps=10,
    save_steps=100,  # only takes effect when save_strategy="steps"
    save_strategy="epoch",
    learning_rate=2e-4,  # 2e-4
    weight_decay=0.01,
    max_grad_norm=1,  # 0.3
    max_steps=-1,
    warmup_ratio=0.1,
    group_by_length=False,
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    # fp16=True,
    lr_scheduler_type="cosine",  # "constant"
    disable_tqdm=False,
    report_to="wandb",
    push_to_hub=False,
)
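
Two quantities follow from the arguments above: the effective batch size per optimizer step and the shape of the learning-rate curve. The sketch below works them out in plain Python. It assumes all 8 GPUs mentioned in the card act as data-parallel workers, uses a hypothetical total of 1000 optimizer steps (the card does not give the real number), and approximates the warmup-plus-cosine schedule with a simple formula rather than calling the library's scheduler.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Sequences contributing to one optimizer step across all devices."""
    return per_device * grad_accum * num_devices


def lr_at_step(
    step: int,
    total_steps: int,
    base_lr: float = 2e-4,
    warmup_ratio: float = 0.1,
) -> float:
    """Linear warmup to base_lr, then a half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


NUM_DEVICES = 8     # assumption: all 8 H100s used as data-parallel workers
TOTAL_STEPS = 1000  # hypothetical; the real step count is not given

# per_device_train_batch_size=1, gradient_accumulation_steps=4
print("effective batch size:", effective_batch_size(1, 4, NUM_DEVICES))

for step in (0, 50, 100, 550, 1000):
    print(f"step {step:4d}: lr = {lr_at_step(step, TOTAL_STEPS):.2e}")
```

With these assumptions each optimizer step sees 1 x 4 x 8 = 32 sequences, and the learning rate climbs linearly over the first 10% of steps before decaying along a cosine curve.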

Safetensors

Model size: 8B params
Tensor type: BF16

Model tree for Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base

Merges: 1 model
Quantizations: 4 models
