---
license: gemma
base_model: google/gemma-2-9b-it
datasets:
- gcelikmasat-work/BPMN-IT-Dataset
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- bpmn
- business-process-modeling
- process-modeling
- instruction-tuning
- lora
- peft
- dot
- graphviz
- llama-factory
- gemma2
model-index:
- name: Gemma2-9B-BPMG-IT
results:
- task:
type: text-generation
name: BPMN Model Generation from Text
dataset:
type: gcelikmasat-work/BPMN-IT-Dataset
name: >-
BPMN-IT (stratified 180-instance benchmark across 15 business
domains, seed split)
metrics:
- type: bleu
value: 82.98
name: BLEU
- type: rouge
value: 94.61
name: ROUGE-L
- type: meteor
value: 92.67
name: METEOR
- type: relative-graph-edit-distance
value: 97.78
name: R-GED Accuracy (%)
---

# Gemma2-9B-BPMG-IT
Gemma2-9B-BPMG-IT is an instruction-tuned language model that converts natural-language business process descriptions into BPMN models rendered in Graphviz DOT. It is a LoRA adaptation of google/gemma-2-9b-it, trained on a cleaned subset of the MaD dataset for the paper:
Generating Business Process Models with Open Source Large Language Models using Instruction Tuning. Gökberk Çelikmasat, Atay Özgövde, Fatma Başak Aydemir. International Conference on Product-Focused Software Process Improvement (PROFES 2025), Springer LNCS, pp. 269–284. DOI: 10.1007/978-3-032-12089-2_17
This model is the subject of our PROFES 2025 conference paper, in which we introduced the instruction-tuning approach and evaluated it against open-weight and proprietary baselines on textual and structural metrics. It is also the BPMG-IT baseline in our subsequent journal paper:
InstruBPM: Instruction-Tuning Open-Weight Language Models for BPMN Model Generation. Çelikmasat, Özgövde, Aydemir. Software and Systems Modeling, under review, 2026. arXiv: 2512.12063
For new projects we recommend the successor model, gcelikmasat-work/Qwen3_4B_BPMN_IT, which matches this model's accuracy with roughly half the parameter count (4B vs. 9B) and ships with quantized and merge-scale variants for deployment trade-offs.
## Results
Evaluated on the 180-instance stratified benchmark used in the InstruBPM journal paper (Table 2), this model attains:
| Metric | Score |
|---|---|
| BLEU | 82.98 |
| ROUGE-L | 94.61 |
| METEOR | 92.67 |
| R-GED Acc. | 97.78 |
These scores are very close to those of the newer 4B Qwen3 successor (which reaches 83.06 / 94.43 / 92.82 / 99.44 on the same benchmark), but this 9B model requires more than twice the memory and compute at inference time.
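The textual metrics can be approximated with the Hugging Face `evaluate` library. The snippet below is only a minimal sketch under that assumption; it is not the evaluation pipeline from the paper (preprocessing and the graph-based R-GED metric are described there), and the prediction/reference strings are placeholders.

```python
# Minimal sketch of the textual metrics using the `evaluate` library.
# NOT the paper's exact evaluation pipeline; the strings below are placeholders.
import evaluate

predictions = ["digraph { start -> review_application; }"]  # model outputs (DOT strings)
references  = ["digraph { start -> review_application; }"]  # gold DOT strings

bleu   = evaluate.load("bleu").compute(predictions=predictions, references=[[r] for r in references])
rouge  = evaluate.load("rouge").compute(predictions=predictions, references=references)
meteor = evaluate.load("meteor").compute(predictions=predictions, references=references)

print(f"BLEU:    {bleu['bleu'] * 100:.2f}")
print(f"ROUGE-L: {rouge['rougeL'] * 100:.2f}")
print(f"METEOR:  {meteor['meteor'] * 100:.2f}")
```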
## Intended use
Generate first-draft BPMN models from textual process descriptions to accelerate early-stage modeling. Intended as an assistant for business process modelers and analysts; human review remains recommended, particularly for gateway logic and activity labels.
## Supported BPMN subset
The model generates BPMN process fragments in DOT notation covering: start events, end events, tasks (activities), sequence flows, and AND/XOR gateways (splits and joins). It does not generate pools, lanes, message flows, data objects, intermediate/boundary events, sub-processes, or annotations.
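For illustration, outputs have roughly the following shape. The node identifiers, labels, and structure below are hypothetical, not a verbatim model output; the sketch only shows the covered element types (start/end events, tasks, sequence flows, and an XOR gateway) with labeled nodes and unlabeled edges, as requested by the prompt.

```dot
// Illustrative sketch only: node names and structure are hypothetical.
digraph BusinessProcess {
  start      [label="Start"];
  submit     [label="Submit Application"];
  review     [label="Review Application"];
  xor_split  [label="Application Approved?"];
  disburse   [label="Disburse Loan"];
  reject     [label="Send Rejection Letter"];
  end        [label="End"];

  start -> submit;
  submit -> review;
  review -> xor_split;
  xor_split -> disburse;
  xor_split -> reject;
  disburse -> end;
  reject -> end;
}
```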
## How to use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "gcelikmasat-work/gemma-2-9b-it-BPMN"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

instruction = (
    "You are an expert in BPMN modeling and DOT language. Your task is to "
    "convert detailed textual descriptions of business processes into accurate "
    "BPMN model codes written in DOT language. Label all nodes with their "
    "activity names. Represent all connections between nodes without labeling "
    "the connections. Represent each node and its connections accurately, "
    "ensuring all decision points and flows are included and connected. "
    "Now, generate BPMN business process model code in DOT language for the "
    "following textual description of a business process: "
)

description = (
    "The process begins when the customer submits an application. After submission, "
    "the application is reviewed by the credit officer. If the application is approved, "
    "the loan is disbursed. Otherwise, a rejection letter is sent. The process ends."
)

# Gemma 2 uses a single user turn without a separate system role
messages = [{"role": "user", "content": instruction + description}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=2048, temperature=0.1, top_p=1.0, do_sample=True)

dot_code = tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(dot_code)
```
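To inspect the result visually, the generated DOT can be rendered with the `graphviz` Python package. This is optional and assumed tooling (it is not required to use the model, and needs the Graphviz binaries installed locally):

```python
# Optional: render the generated DOT to an image for inspection.
# Assumes the `graphviz` Python package and Graphviz binaries are installed.
import graphviz

src = graphviz.Source(dot_code)  # dot_code from the generation snippet above
src.render("bpmn_process", format="png", cleanup=True)  # writes bpmn_process.png
```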
## Training
Trained with LLaMA-Factory using LoRA on Gemma-2 9B Instruct. Detailed hyperparameters are reported in the PROFES 2025 paper. Training data: 21.5k cleaned instruction–input–output triples from MaD, split 80/10/10 for train/validation/test. The full splits are available at gcelikmasat-work/BPMN-IT-Dataset.
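The splits can be pulled directly with the `datasets` library; note that the split names used below are an assumption and may differ in the dataset repository:

```python
# Load the instruction-tuning data; split names ("train"/"validation"/"test")
# are assumed here and may differ in the dataset repository.
from datasets import load_dataset

ds = load_dataset("gcelikmasat-work/BPMN-IT-Dataset")
print(ds)              # shows the available splits and column names
print(ds["train"][0])  # one instruction-input-output triple, if a "train" split exists
```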
## Limitations
- Scope. Control-flow slice of BPMN only (tasks, events, sequence flows, AND/XOR gateways). No pools, lanes, message flows, data objects, or sub-processes.
- Language. English only.
- Parameter efficiency. At 9B parameters, this model is roughly twice the size of the Qwen3-4B successor for comparable accuracy. For deployment-constrained settings, the 4B successor is preferred.
- Semantic equivalence. Structural similarity does not imply semantic equivalence, especially when input descriptions are ambiguous.
## Citation
If you use this model, please cite the PROFES 2025 paper:
```bibtex
@inproceedings{celikmasat2025bpmg,
  title     = {Generating Business Process Models with Open Source Large Language Models using Instruction Tuning},
  author    = {{\c{C}}elikmasat, G{\"o}kberk and {\"O}zg{\"o}vde, Atay and Aydemir, Fatma Ba{\c{s}}ak},
  booktitle = {Product-Focused Software Process Improvement (PROFES 2025)},
  series    = {Lecture Notes in Computer Science},
  pages     = {269--284},
  year      = {2025},
  publisher = {Springer},
  doi       = {10.1007/978-3-032-12089-2_17},
  url       = {https://doi.org/10.1007/978-3-032-12089-2_17}
}
```
If you are comparing against this model as a baseline in a follow-up study, please also cite the journal extension:
```bibtex
@article{celikmasat2026instrubpm,
  title   = {InstruBPM: Instruction-Tuning Open-Weight Language Models for BPMN Model Generation},
  author  = {{\c{C}}elikmasat, G{\"o}kberk and {\"O}zg{\"o}vde, Atay and Aydemir, Fatma Ba{\c{s}}ak},
  journal = {Software and Systems Modeling},
  year    = {2026},
  note    = {Under review. arXiv:2512.12063},
  url     = {https://arxiv.org/abs/2512.12063}
}
```
Please also cite the source dataset:
```bibtex
@inproceedings{li2023mad,
  title     = {{MaD}: A Dataset for Interview-based {BPM} in Business Process Management},
  author    = {Li, Xiang and Ni, Lijuan and Li, Ran and Liu, Jiafei and Zhang, Ming},
  booktitle = {2023 International Joint Conference on Neural Networks (IJCNN)},
  pages     = {1--8},
  year      = {2023},
  publisher = {IEEE}
}
```
## License
Released under the Gemma license, inherited from the base model (google/gemma-2-9b-it). Use is subject to Google's Gemma Prohibited Use Policy. The training data is distributed separately under the terms of the MaD dataset.