LoRA Adapter Card for DLKVM-Summary-Llama3-1B
Table of Contents
- Adapter Information
- How to Get Started with the Adapter
- Uses
- Training Details
- Evaluation
- Citation
- License
Adapter Information
Adapter Name: DLKVM-Summary-Llama3-LoRA-Adapter
Base Model: VSSA-SDSA/LT_AI_DLKVM
Architecture: Llama3 CausalLM
Task: Abstractive summary generation
How to Get Started with the Adapter
This adapter can be used for Lithuanian abstractive summary generation (inference) with the Hugging Face transformers and peft libraries.
Environment Setup
Install the required libraries from the requirements file. Tested with Python 3.12.12:
pip install -r requirements.txt
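If you are not working from a clone of this repository, the core runtime dependencies can also be installed directly. This is an assumption based on the imports in the example below and the Transformers version listed under Training Details, not the contents of requirements.txt (accelerate is needed for device_map loading):
pip install torch transformers==4.54.1 peft accelerate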
Code Example
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

MODEL_ID = "VSSA-SDSA/LT_AI_DLKVM"
LORA_ADAPTER = "VSSA-SDSA/LT_AI_DLKVM_demo"
MAX_NEW_TOKENS = 200

tekstas = "Your text to summarize"

# Load the tokenizer and make sure a padding token is defined.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Load the base model on GPU 0 with the SDPA attention backend.
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation="sdpa",
)

# Attach the LoRA adapter in inference-only mode.
model = PeftModel.from_pretrained(
    base_model,
    LORA_ADAPTER,
    is_trainable=False,
)
model.eval()

# Build the prompt in the format used during fine-tuning.
prompt = (
    f"<|im_start|>Teksto pradžia:\n{tekstas}<|im_end|>\n"
    f"<|im_start|>Santraukos pradžia:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)

# Map the end-of-summary marker to its single-token EOS id(s).
end_tokens = ["<|im_end|>"]
eos_ids = tokenizer(end_tokens, add_special_tokens=False).input_ids
eos_ids = [ids[0] for ids in eos_ids if len(ids) == 1]

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=MAX_NEW_TOKENS,
        do_sample=False,
        repetition_penalty=2.5,
        eos_token_id=eos_ids,
        pad_token_id=tokenizer.pad_token_id,
        num_beams=2,
        early_stopping=True,
    )

# Decode only the newly generated tokens, i.e. the summary.
generated = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True).strip()
print(generated)
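For repeated inference without keeping peft in the loop, the adapter weights can also be merged into the base model. A short sketch using peft's merge_and_unload; the output directory name is only an example:

# Merge the LoRA weights into the base model and save a standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("lt_ai_dlkvm_merged")
tokenizer.save_pretrained("lt_ai_dlkvm_merged")

The merged checkpoint can then be loaded with AutoModelForCausalLM.from_pretrained alone, without PeftModel.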
Flash-Attention Support
The model supports flash_attention_2; to use it, install the additional dependencies listed below.
Python 3.12
pip install flash-attn==2.7.4.post1 --no-build-isolation
Python 3.13
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp313-cp313-linux_x86_64.whl
After installing the library, update the base model loading code:
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation="flash_attention_2",
)
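Because flash-attn is an optional build, loading code can also pick the backend at runtime. A defensive sketch (not from the original card) that falls back to SDPA when the package is missing:

import importlib.util
import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "VSSA-SDSA/LT_AI_DLKVM"

# Use flash_attention_2 only when the flash-attn package is importable.
attn_backend = "flash_attention_2" if importlib.util.find_spec("flash_attn") else "sdpa"
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation=attn_backend,
)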
Uses
- Abstractive summary generation for Lithuanian texts
- Applications: law, medicine, media, and information technology topics
Training Details
For model validation, the summary corpus being developed in the project "Summary Corpora for Artificial Intelligence" (No. 02-101-K-0001) was used.
Training Configuration
lora_settings:
  r: 64
  lora_alpha: 128
  target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  lora_dropout: 0.05
  task_type: "CAUSAL_LM"
  use_rslora: True
training:
  per_device_train_batch_size: 4
  gradient_accumulation_steps: 16
  bf16: True
  learning_rate: 6e-5
  warmup_ratio: 0.063
  weight_decay: 0.053
  num_train_epochs: 4
  lr_scheduler_type: "cosine"
  optim: "adafactor"
  adam_epsilon: 1e-6
  max_grad_norm: 1.0
Environment: Hugging Face Transformers (v4.54.1)
Hardware: 1× NVIDIA RTX A6000 ADA
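The lora_settings block above maps directly onto peft's LoraConfig. A minimal sketch of how the adapter could be instantiated for training; the surrounding trainer setup is omitted, and base_model is assumed to be loaded as in the inference example:

from peft import LoraConfig, get_peft_model

# LoRA settings mirroring the configuration above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    use_rslora=True,
)
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # sanity check: only LoRA weights are trainable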
Evaluation
| ROUGE-1 | ROUGE-2 | ROUGE-L | BERTScore Precision | BERTScore Recall | BERTScore F1 | BLEU |
|---|---|---|---|---|---|---|
| 0.3230 | 0.1377 | 0.2135 | 0.8786 | 0.8683 | 0.8732 | 10.2290 |
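Metrics of this kind can be computed with the Hugging Face evaluate library. A sketch assuming parallel lists of generated and reference summaries; the card does not specify the exact evaluation script, and sacrebleu is assumed here because the BLEU score is on a 0-100 scale:

import evaluate

predictions = ["generated summary 1"]  # replace with model outputs
references = ["reference summary 1"]   # replace with gold summaries

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
sacrebleu = evaluate.load("sacrebleu")

print(rouge.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="lt"))
print(sacrebleu.compute(predictions=predictions, references=[[r] for r in references]))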
Citation
If you use LT_AI_DLKVM_demo or any part of this repository in research or production, please cite as follows (BibTeX):
@misc{SDSA_LT-AI-DLKVM-demo_2026,
  title = {{LT-AI-DLKVM-demo}: Lithuanian Llama 3 model for abstracts generation},
  author = {{State Digital Solutions Agency (SDSA)}},
  year = {2026},
  howpublished = {\url{https://huggingface.co/VSSA-SDSA/LT_AI_DLKVM_demo}},
  note = {Developed by Vytautas Magnus University (VMU), UAB Neurotechnology, UAB Tilde informacinės technologijos, MB Krilas}
}
License
Copyright (c) 2026 State Digital Solutions Agency (SDSA)
Developed by Vytautas Magnus University (VMU), UAB Neurotechnology, UAB Tilde informacinės technologijos, MB Krilas
Licensed under NewGenLTU openRAIL-M
Notice: Funded by the Recovery and Resilience Facility under the "New Generation Lithuania" plan