Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:# Run inference directly in the terminal:
llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:# Run inference directly in the terminal:
./llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:# Run inference directly in the terminal:
./build/bin/llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:Use Docker
docker model run hf.co/Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:Model Card for Model ID
AI μ λΉ λ°μ΄ν° λΆμ μ λ¬Έ κΈ°μ μΈ Linkbricksμ λ°μ΄ν°μ¬μ΄μΈν°μ€νΈμΈ μ§μ€μ±(Saxo) μ΄μ¬κ° meta-llama/Meta-Llama-3-8Bλ₯Ό λ² μ΄μ€λͺ¨λΈλ‘ GCPμμ H100-80G 8κ°λ₯Ό ν΅ν΄ SFT-DPO νλ ¨μ ν(8000 Tokens) νκΈ κΈ°λ° λͺ¨λΈ. ν ν¬λμ΄μ λ λΌλ§3λ λμΌνλ©° νκΈ VOCA νμ₯μ νμ§ μμ λ²μ μ λλ€. νκΈμ΄ 20λ§κ° μ΄μ ν¬ν¨λ νκΈμ μ© ν ν¬λμ΄μ λͺ¨λΈμ λ³λ μ°λ½ μ£ΌμκΈ° λ°λλλ€.
Dr. Yunsung Ji (Saxo), a data scientist at Linkbricks, a company specializing in AI and big data analytics, trained the meta-llama/Meta-Llama-3-8B base model on 8 H100-60Gs on GCP for 4 hours of instructional training (8000 Tokens). Accelerate, Deepspeed Zero-3 libraries were used.
www.linkbricks.com, www.linkbricks.vc
Configuration including BitsandBytes
bnb_config = BitsAndBytesConfig( load_in_4bit=True, bnb_4bit_use_double_quant=False, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch_dtype )
args = TrainingArguments( output_dir=project_name, run_name=run_name_str, overwrite_output_dir=True, num_train_epochs=20, per_device_train_batch_size=1, gradient_accumulation_steps=4, #1 gradient_checkpointing=True, optim="paged_adamw_32bit", #optim="adamw_8bit", logging_steps=10, save_steps=100, save_strategy="epoch", learning_rate=2e-4, #2e-4 weight_decay=0.01, max_grad_norm=1, #0.3 max_steps=-1, warmup_ratio=0.1, group_by_length=False, fp16 = not torch.cuda.is_bf16_supported(), bf16 = torch.cuda.is_bf16_supported(), #fp16 = True, lr_scheduler_type="cosine", #"constant", disable_tqdm=False, report_to='wandb', push_to_hub=False )
- Downloads last month
- 17
Install from brew
# Start a local OpenAI-compatible server with a web UI: llama-server -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base:# Run inference directly in the terminal: llama-cli -hf Saxo/Linkbricks-Horizon-AI-Korean-llama3-sft-dpo-8b-base: