LoRA Adapter Card for DLKVM-Summary-Llama3-1B

Adapter Information

Adapter Name: DLKVM-Summary-Llama3-LoRA-Adapter

Base Model: VSSA-SDSA/LT_AI_DLKVM

Architecture: Llama3 CausalLM

Task: Abstractive summary generation

How to Get Started with the Adapter

This adapter can be used for Lithuanian abstractive summary generation (inference) with the Hugging Face transformers and peft libraries.

Environment Setup

Install the required Python libraries from the requirements file. Tested with Python 3.12.12.

pip install -r requirements.txt
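
For reference, the environment can be pinned roughly as follows. This is a sketch only: the transformers version (4.54.1) and Python version come from this card, while the remaining package choices and versions are assumptions; the repository's own requirements.txt is authoritative.

torch
transformers==4.54.1
peft
accelerate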

Code Snippet

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

MODEL_ID = "VSSA-SDSA/LT_AI_DLKVM"
LORA_ADAPTER = "VSSA-SDSA/LT_AI_DLKVM_demo"
MAX_NEW_TOKENS = 200

tekstas = "Jūsų tekstas santraukos generavimui"  # the Lithuanian text to summarize

# Load the tokenizer; decoder-only generation needs left padding and an
# explicit pad token.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Load the base model on GPU 0 with the SDPA attention backend.
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation="sdpa",
)

# Attach the LoRA adapter in inference-only mode.
model = PeftModel.from_pretrained(
    base_model,
    LORA_ADAPTER,
    is_trainable=False,
)
model.eval()

# Prompt format used during fine-tuning: the source text, then the summary
# header ("Teksto pradžia" = start of text, "Santraukos pradžia" = start of
# summary); the model continues from the summary header.
prompt = (
    f"<|im_start|>Teksto pradžia:\n{tekstas}<|im_end|>\n"
    f"<|im_start|>Santraukos pradžia:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)

# Stop generation at the <|im_end|> marker (kept only if it maps to a single token).
end_tokens = ["<|im_end|>"]
eos_ids = tokenizer(end_tokens, add_special_tokens=False).input_ids
eos_ids = [ids[0] for ids in eos_ids if len(ids) == 1]

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=MAX_NEW_TOKENS,
        do_sample=False,
        repetition_penalty=2.5,
        eos_token_id=eos_ids,
        pad_token_id=tokenizer.pad_token_id,
        num_beams=2,
        early_stopping=True,
    )

# Decode only the newly generated tokens (the summary).
generated = tokenizer.decode(
    outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True
).strip()
print(generated)
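
If you want to serve the summarizer without a runtime peft dependency, the adapter weights can be folded into the base model. A minimal sketch using peft's merge_and_unload; the output directory name is illustrative:

# Merge the LoRA weights into the base model and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("lt_ai_dlkvm_merged")      # example path
tokenizer.save_pretrained("lt_ai_dlkvm_merged")

The merged checkpoint can then be loaded with AutoModelForCausalLM.from_pretrained alone.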

Flash-Attention Support

The model supports flash_attention_2; to use it, you need to install additional dependencies.

Python 3.12

pip install flash-attn==2.7.4.post1 --no-build-isolation

Python 3.13

pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp313-cp313-linux_x86_64.whl

After installing the dependencies, update the base model loading call:

base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation="flash_attention_2",  # the only change vs. the SDPA setup
)
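
Because flash-attn is an optional dependency, a guarded backend selection keeps the same script runnable on machines where it is absent. A minimal sketch, reusing MODEL_ID from the snippet above:

import torch
from transformers import AutoModelForCausalLM

# Use flash_attention_2 when the flash-attn package is importable,
# otherwise fall back to the SDPA backend shown earlier.
try:
    import flash_attn  # noqa: F401
    attn_impl = "flash_attention_2"
except ImportError:
    attn_impl = "sdpa"

base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation=attn_impl,
)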

Uses

  • Abstractive summary generation for Lithuanian texts
  • Applications: Law, Healthcare, Information Technology, and News topics

Training Details

The model was validated on the summary corpus developed in the project “Summary Corpora for Artificial Intelligence” (No. 02-101-K-0001).

Training Configuration

lora_settings:
  r: 64
  lora_alpha: 128
  target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  lora_dropout: 0.05
  task_type: "CAUSAL_LM"
  use_rslora: True
training:
  per_device_train_batch_size: 4
  gradient_accumulation_steps: 16
  bf16: True
  learning_rate: 6e-5
  warmup_ratio: 0.063
  weight_decay: 0.053
  num_train_epochs: 4
  lr_scheduler_type: "cosine"
  optim: "adafactor"
  adam_epsilon: 1e-6
  max_grad_norm: 1.0
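
For orientation, the lora_settings block maps onto a peft LoraConfig roughly as follows (a sketch; the exact training script is not part of this card). Note that the training block implies an effective batch size of 4 × 16 = 64 sequences per optimizer step:

from peft import LoraConfig

# LoRA configuration mirroring the lora_settings block above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    use_rslora=True,  # rank-stabilized scaling: alpha / sqrt(r)
)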

Environment: Hugging Face Transformers (v4.54.1)
Hardware: 1× NVIDIA RTX A6000 ADA

Evaluation

Metric                Score
Rouge-1               0.3230
Rouge-2               0.1377
Rouge-L               0.2135
BertScore Precision   0.8786
BertScore Recall      0.8683
BertScore F1          0.8732
BLEU                  10.2290
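
Scores of this kind can be reproduced with the Hugging Face evaluate library. A minimal sketch (the exact evaluation script, BERTScore backbone, and BLEU variant used for this card are not specified, so the defaults below are assumptions; sacrebleu is used here because the reported BLEU is on a 0-100 scale):

import evaluate

predictions = ["sugeneruota santrauka"]  # model outputs (placeholders)
references = ["etaloninė santrauka"]     # gold summaries (placeholders)

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
bleu = evaluate.load("sacrebleu")

print(rouge.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="lt"))
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))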

Citation

If you use LT_AI_DLKVM_demo or any part of this repository in your research or deployment, please cite as follows (BibTeX):

@misc{SDSA_LT-AI-DLKVM-demo_2026,
  title        = {{LT-AI-DLKVM-demo}: Lithuanian Llama 3 model for abstracts generation},
  author       = {{State Digital Solutions Agency (SDSA)}},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/VSSA-SDSA/LT_AI_DLKVM_demo}},
  note         = {Developed by Vytautas Magnus University (VMU), UAB Neurotechnology, UAB Tilde informacinės technologijos, MB Krilas}
}

License

Copyright (c) 2026 State Digital Solutions Agency (SDSA)

Developed by Vytautas Magnus University (VMU), UAB Neurotechnology, UAB Tilde informacinės technologijos, MB Krilas

Licensed under NewGenLTU openRAIL-M

Notice: Funded under the "New Generation Lithuania" plan of the Recovery and Resilience Facility
