Kanana Legal FP16 GGUF
This is an FP16 (half-precision) GGUF build of a Korean legal fine-tuned model based on Kakao's Kanana Nano 2.1B Instruct.
Model Overview
This lightweight 2.1B-parameter model is fine-tuned on Korean legal terminology. It is small enough to run on CPU-only machines without internet connectivity, which makes it suited to offline legal document processing in restricted environments.
Key Features
- CPU-Optimized: Designed for CPU inference; no GPU required
- Offline Capable: No internet connection needed for inference
- Legal Domain: Fine-tuned on 17,484 Korean legal term definitions
- GGUF Format: Compatible with llama.cpp and other GGUF-compatible tools
- FP16 Precision: 16-bit floating-point weights, the highest-precision variant of this model
Training Data
The model was fine-tuned on the Korean Legal Terminology dataset:
- Samples: 17,484 legal term definitions
- Format: Instruction-following (input/output pairs)
- Domain: Korean legal terminology, concepts, and definitions
- Language: Korean
Fine-tuning Details
Base Model
- Model: kakaocorp/kanana-nano-2.1b-instruct
- Size: 2.1B parameters
- Architecture: Transformer-based causal language model
Training Configuration
QLoRA Settings:
- Quantization Type: NF4 (Normal Float 4-bit)
- Compute dtype: bfloat16
- Double Quantization: Enabled
- LoRA Rank (r): 16
- LoRA Alpha: 32
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- LoRA Dropout: 0.05
- Trainable Parameters: ~23M (1.95% of total)
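As a rough illustration, the QLoRA settings listed above could map onto keyword arguments as sketched below. This assumes the training used `BitsAndBytesConfig` from Hugging Face Transformers and `LoraConfig` from PEFT, which the acknowledgments suggest but this card does not show. The helper names `lora_scaling` and `build_qlora_configs`, and the `bias` and `task_type` values, are assumptions for illustration and are not taken from the original training code.

```python
# Sketch only: maps the QLoRA settings listed above onto keyword arguments.
# Assumes transformers' BitsAndBytesConfig and peft's LoraConfig were used;
# the actual training script is not shown in this card.

BNB_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",       # NF4 (Normal Float 4-bit)
    "bnb_4bit_use_double_quant": True,  # double quantization enabled
    # bnb_4bit_compute_dtype (bfloat16) is added in build_qlora_configs()
}

LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "bias": "none",            # assumption: not stated in the card
    "task_type": "CAUSAL_LM",  # assumption: causal language modeling
}


def lora_scaling(kwargs):
    """Standard LoRA scales the low-rank update by lora_alpha / r."""
    return kwargs["lora_alpha"] / kwargs["r"]


def build_qlora_configs():
    """Hypothetical helper: builds both config objects.

    Requires torch, transformers, and peft; imports are deferred so the
    dictionaries above can be inspected without those packages installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16, **BNB_KWARGS
    )
    lora_config = LoraConfig(**LORA_KWARGS)
    return bnb_config, lora_config


if __name__ == "__main__":
    print("LoRA scaling factor:", lora_scaling(LORA_KWARGS))
```

With rank 16 and alpha 32, the adapter update is scaled by a factor of 2 under the standard LoRA formulation.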
Training Hyperparameters:
- Epochs: 3
- Batch Size: 4 (per device)
- Gradient Accumulation Steps: 4
- Effective Batch Size: 16
- Optimizer: Paged AdamW 8-bit
- Learning Rate: 2e-4
- LR Scheduler: Cosine
- Warmup Ratio: 0.03
- Weight Decay: 0.01
- Max Gradient Norm: 0.3
- Mixed Precision: bfloat16
- Max Sequence Length: 2048
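The arithmetic behind these hyperparameters can be sketched as follows. The sample count, batch settings, epochs, and warmup ratio come from this card; the single-device setup and the absence of a held-out split are assumptions, and the function names are illustrative. Exact step counts depend on how a given trainer rounds partial batches, so treat the result as an estimate.

```python
import math

# Values stated in this card
NUM_SAMPLES = 17_484
EPOCHS = 3
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.03

# Assumptions (not stated in the card): one device, all samples used for training
NUM_DEVICES = 1


def effective_batch_size(per_device, accum_steps, devices=1):
    """Samples contributing to each optimizer update."""
    return per_device * accum_steps * devices


def estimate_schedule(num_samples, epochs, eff_batch, warmup_ratio):
    """Return (steps_per_epoch, total_steps, warmup_steps), rounding up."""
    steps_per_epoch = math.ceil(num_samples / eff_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps


if __name__ == "__main__":
    eff = effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_DEVICES)
    per_epoch, total, warmup = estimate_schedule(
        NUM_SAMPLES, EPOCHS, eff, WARMUP_RATIO
    )
    print(f"effective batch size: {eff}")
    print(f"optimizer steps per epoch: {per_epoch}")
    print(f"total optimizer steps: {total}")
    print(f"warmup steps: {warmup}")
```

Under these assumptions the run comes to roughly 3,300 optimizer steps, with about 100 of them spent in warmup.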
Optimization Techniques
- Gradient Checkpointing: Reduces memory during backpropagation
- Paged Optimizers: Efficient memory management for optimizer states
- Mixed Precision: bfloat16 for faster computation
- Gradient Accumulation: Simulates larger batch sizes
Usage
Requirements
# Build llama.cpp (or install another GGUF-compatible runtime)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-cli
Inference Example
# Run with llama.cpp (current builds name the binary llama-cli; older builds used ./main)
./build/bin/llama-cli -m kanana-legal-f16.gguf -p "### 질문:\n다음 법률 용어를 설명해줘: 흡수합병\n\n### 답변:" -n 512
# English: "Question: Explain the following legal term: Merger by Absorption"
Python Usage (with llama-cpp-python)
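from llama_cpp import Llama

# Load the model from a local GGUF file
llm = Llama(
    model_path="kanana-legal-f16.gguf",
    n_ctx=2048,    # context window, matching the training max sequence length
    n_threads=8,   # CPU threads; adjust to the number of available cores
)

# Build the prompt (same text as the llama.cpp example, including the blank line)
prompt = """### 질문:
다음 법률 용어를 설명해줘: 흡수합병

### 답변:"""
# English: "Question: Explain the following legal term: Merger by Absorption / Answer:"

# Generate a response
output = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,
    top_p=0.9,
)
print(output['choices'][0]['text'])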
Prompt Format
The model expects the following instruction format:
### 질문:
다음 법률 용어(한자: 吸收合倂)를 설명해줘: 흡수합병
### 답변:
법률이 정하는 절차에 의하여 2 이상의 법인 전부 또는 그중 1개의 법인이외의 법인이 해산하여...
English Translation:
### Question:
Explain the following legal term (Hanja: 吸收合倂): Merger by Absorption
### Answer:
According to the procedures prescribed by law, when two or more corporations, or all corporations except one, are dissolved...
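The template above can be assembled programmatically. The following is a minimal sketch, not part of the original training code: `build_prompt` and `extract_answer` are hypothetical helper names, and the blank line before the answer marker follows the `\n\n` used in the llama.cpp command-line example.

```python
QUESTION_MARKER = "### 질문:"  # "Question:"
ANSWER_MARKER = "### 답변:"    # "Answer:"


def build_prompt(term, hanja=None):
    """Format a legal term into the question/answer template shown above.

    term  -- the Korean legal term to explain, e.g. "흡수합병"
    hanja -- optional Hanja spelling, added in parentheses when given
    """
    subject = f"법률 용어(한자: {hanja})" if hanja else "법률 용어"
    return f"{QUESTION_MARKER}\n다음 {subject}를 설명해줘: {term}\n\n{ANSWER_MARKER}"


def extract_answer(generated_text):
    """Keep only the first answer if the model starts a new question block."""
    return generated_text.split(QUESTION_MARKER)[0].strip()


if __name__ == "__main__":
    print(build_prompt("흡수합병"))
    print(build_prompt("흡수합병", hanja="吸收合倂"))
```

The returned string can be passed as the prompt to the `llm(...)` call from llama-cpp-python. Passing `stop=["### 질문:"]` to that call is one way to end generation before the model begins another question block.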
Disclaimer
IMPORTANT: This is an ultra-lightweight 2.1B parameter model designed for CPU-only inference in offline environments. Due to its compact size and training constraints:
- Legal information may not be accurate or complete
- Should NOT be used as a substitute for professional legal advice
- No warranty or liability is provided for any use of this model
- Users assume all responsibility for verifying information
- For informational and educational purposes only
Always consult qualified legal professionals for actual legal matters.
Model Variants
This model is available in multiple quantization levels:
| Model | Size | Precision | Use Case |
|---|---|---|---|
| kanana-legal-f16 | ~4.2GB | FP16 | Maximum accuracy, higher memory |
| kanana-legal-q8_0 | ~2.2GB | Q8_0 | Balanced accuracy/size |
| kanana-legal-q4_k_m | ~1.4GB | Q4_K_M | Smallest size, fastest inference |
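The FP16 and Q8_0 sizes in the table are consistent with a simple bits-per-weight estimate, sketched below. The parameter count comes from this card. The bits-per-weight figures are assumptions based on the GGUF block layouts: FP16 stores 16 bits per weight, and Q8_0 stores 32 8-bit weights plus one 16-bit scale per block, giving 8.5 bits per weight. Q4_K_M mixes several quantization types across tensors, so it is left out. Actual files also carry metadata and may keep some tensors at higher precision.

```python
NUM_PARAMS = 2.1e9  # parameter count stated in this card

# Approximate storage cost per weight, in bits (assumed from GGUF block layouts)
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": (32 * 8 + 16) / 32,  # 32 int8 weights + one fp16 scale per block
}


def estimated_size_gb(num_params, bits_per_weight):
    """Estimate file size in decimal gigabytes, ignoring metadata overhead."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: ~{estimated_size_gb(NUM_PARAMS, bits):.2f} GB")
```

The same figure gives a lower bound on the RAM needed to hold the weights; the context cache and runtime buffers add to it.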
License
Apache 2.0 License - This model inherits the license from the base Kanana model.
Acknowledgments
- Base Model: Kakao Corp for Kanana Nano 2.1B Instruct
- Dataset: Korean Legal Terminology
- QLoRA: Tim Dettmers et al. for the QLoRA method
- Libraries: HuggingFace Transformers, PEFT, TRL, bitsandbytes
- GGUF: llama.cpp for GGUF format
Related Resources
- Training Code: ko_legal_finetune
- Training Dataset: korean-legal-terminology
- Base Model: kanana-nano-2.1b-instruct
Contact
For questions or issues, please visit the GitHub repository.