# gemma-e4b-firefly
A LoRA fine-tune of Gemma 4 E4B (7.52B dense parameters) for C/C++ vulnerability
classification. Given a single C or C++ function, it returns a JSON object with
a binary label (`clean` | `vulnerable`) and a list of CWE identifiers.
This model is not a reasoner. Disable the Gemma 4 thinking channel
(`chat_template_kwargs.enable_thinking=false`), or you will get empty
responses: the JSON gets absorbed into a `<think>` block instead.
## Prompt format
System prompt (copy verbatim):

```
You are a security reviewer. Return JSON only with keys label and cwe_ids. The label field must be exactly "clean" or "vulnerable".
```
User message:

````
Project: <project-name>
Language: C/C++

Determine whether this function is vulnerable.

```c
<function source>
```
````
Expected response:

```json
{"label":"vulnerable","cwe_ids":["CWE-125"]}
```
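A caller should validate the response shape before trusting it. A minimal sketch using only the standard library (the helper name and regex are illustrative, not part of the model):

```python
import json
import re

CWE_RE = re.compile(r"^CWE-\d+$")

def parse_verdict(text: str) -> dict:
    """Parse the model's JSON verdict and reject malformed outputs."""
    obj = json.loads(text)
    if obj.get("label") not in ("clean", "vulnerable"):
        raise ValueError(f"unexpected label: {obj.get('label')!r}")
    if not all(isinstance(c, str) and CWE_RE.match(c) for c in obj.get("cwe_ids", [])):
        raise ValueError("malformed cwe_ids")
    return obj

verdict = parse_verdict('{"label":"vulnerable","cwe_ids":["CWE-125"]}')
print(verdict["cwe_ids"])  # → ['CWE-125']
```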
## Inference — llama.cpp
```shell
llama-server -m gemma-e4b-firefly-q4_k_m.gguf \
  --host 127.0.0.1 --port 8080 \
  -c 4096 --temp 0 --top-k 1 --top-p 1 -n 256
```
```shell
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a security reviewer. Return JSON only with keys label and cwe_ids. The label field must be exactly \"clean\" or \"vulnerable\"."},
      {"role": "user", "content": "Project: core\nLanguage: C/C++\nDetermine whether this function is vulnerable.\n\n```c\n<paste function here>\n```"}
    ],
    "temperature": 0.0,
    "max_tokens": 128,
    "chat_template_kwargs": {"enable_thinking": false}
  }'
```
## Inference — MLX (Apple Silicon)
```python
import json

from mlx_lm import load, generate

model, tokenizer = load("trevon/gemma-e4b-firefly/mlx/gemma-e4b-firefly-4bit")

messages = [
    {"role": "system", "content": "You are a security reviewer. Return JSON only with keys label and cwe_ids. The label field must be exactly \"clean\" or \"vulnerable\"."},
    {"role": "user", "content": "Project: core\nLanguage: C/C++\nDetermine whether this function is vulnerable.\n\n```c\n<function source>\n```"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,
)
out = generate(model, tokenizer, prompt=prompt, max_tokens=128)
# Model may wrap JSON in a ```json fence. Strip before json.loads if present.
print(out)
```
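The fence-stripping mentioned in the comment takes a few lines; a minimal sketch (the helper name is illustrative):

```python
import json
import re

def strip_json_fence(text: str) -> str:
    """Remove an optional ```json ... ``` wrapper around the model output."""
    m = re.search(r"```(?:json)?\s*(.*?)\s*```", text, flags=re.DOTALL)
    return m.group(1) if m else text.strip()

raw = '```json\n{"label":"clean","cwe_ids":[]}\n```'
print(json.loads(strip_json_fence(raw)))  # → {'label': 'clean', 'cwe_ids': []}
```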
## Evaluation
Six held-out gates from an internal C/C++ vulnerability benchmark, greedy
decode at T=0.0.
| gate | tuned label_acc | Δ vs base | tuned CWE top-1 |
|---|---|---|---|
| source_50_a | 0.680 | +0.04 | 0.103 |
| source_50_b | 0.660 | +0.10 | 0.069 |
| source_50_c | 0.700 | +0.08 | 0.000 |
| source_200_a | 0.635 | +0.06 | 0.043 |
| source_200_b | 0.605 | +0.07 | 0.026 |
| source_200_c | 0.665 | +0.075 | 0.026 |
Mean 200-row Δ: +0.068. No parse failures, no empty labels.
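The reported mean follows directly from the three 200-row rows in the table:

```python
# Δ vs base for source_200_a, source_200_b, source_200_c
deltas = [0.06, 0.07, 0.075]
mean = sum(deltas) / len(deltas)
print(round(mean, 3))  # → 0.068
```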
## Files

| file | size | sha256 / notes |
|---|---|---|
| gemma-e4b-firefly-bf16.gguf | 14 GB | 27cd72a50756bf384724dd3c4590e184bee60162e9343d62e90151875f4eb69c |
| gemma-e4b-firefly-q8_0.gguf | 7.5 GB | 1dea37d5b796f7771a4a5b12eea55e78d504f18605aa1acba729bb5289b1afbc |
| gemma-e4b-firefly-q4_k_m.gguf | 5.0 GB | 0a1b5e91c9cef35add47b82033f7196f9a5774176e62e8ef382abab793a7a60e |
| mlx/gemma-e4b-firefly-bf16/ | 14 GB | MLX bf16 |
| mlx/gemma-e4b-firefly-mxfp8/ | 7.9 GB | MLX 8-bit (group size 32) |
| mlx/gemma-e4b-firefly-4bit/ | 4.0 GB | MLX 4-bit (group size 64) |
Q4_K_M is the recommended quant for laptops and consumer GPUs.
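Downloads can be checked against the hashes in the table; a minimal sketch using only the standard library (the filename in the commented line is an example):

```python
import hashlib

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large GGUFs never load fully into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

expected = "0a1b5e91c9cef35add47b82033f7196f9a5774176e62e8ef382abab793a7a60e"
# assert sha256_of("gemma-e4b-firefly-q4_k_m.gguf") == expected
```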
## Limitations
- C/C++ only. Not evaluated on other languages.
- Label accuracy ≈ 0.65. Research adapter, not a production classifier — use it as a ranking signal, not a verdict.
- Weak CWE top-1 (0.03–0.10). The model often picks a plausible but wrong CWE from the same family.
- No reasoning traces. JSON-only training means no explanations or follow-up Q&A.
- ~2k training context, so functions longer than ~1500 LoC are OOD.
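Given the ~2k training context, callers may want to gate inputs on length before scoring; a trivial guard mirroring the limitation above (the threshold and function name are this sketch's choices):

```python
MAX_LINES = 1500  # beyond this, the function is out-of-distribution for the adapter

def in_distribution(source: str) -> bool:
    """True if the function is short enough to score reliably."""
    return source.count("\n") + 1 <= MAX_LINES

print(in_distribution("int f(void) { return 0; }"))  # → True
```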
## Training

- Base: `google/gemma-4-e4b-it` (MLX snapshot `mlx-community/gemma-4-e4b-it-bf16`)
- LoRA: r=8, α=2 on q/k/v/o + gate/up/down projections, all 42 blocks, 500 iterations; the step-100 checkpoint was selected.
- Corpus: 48,734 rows from PrimeVul + BigVul with strict `{"label","cwe_ids"}` targets, deduplicated against the eval benchmark by `code_sha256`.
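The dedup step amounts to a set-difference on content hashes. A minimal illustration (the row shapes are assumptions for this sketch, not the actual corpus format):

```python
import hashlib

def code_sha256(src: str) -> str:
    """Content hash of a function's source, as used for the dedup key."""
    return hashlib.sha256(src.encode("utf-8")).hexdigest()

train = [{"code": "int a(void){return 1;}"}, {"code": "int b(void){return 2;}"}]
eval_hashes = {code_sha256("int a(void){return 1;}")}  # functions in the benchmark

deduped = [row for row in train if code_sha256(row["code"]) not in eval_hashes]
print(len(deduped))  # → 1
```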
## License
Inherits the Gemma Terms of Use.