cassandra-anon committed on
Commit 5698e6e · verified · 1 Parent(s): c7eab6d

Align README with paper: numbers, title, section refs

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -23,10 +23,10 @@ Fine-tuned CTI-BERT models for extracting MITRE ATT&CK techniques from cyber thr
 
  On the **TRAM2** test set (30 scored documents):
 
- - **3-seed ensemble per-document F1 (τ=0.5): 73.58%**
- - Paper reports 73.87% on the same configuration; the 0.29 F1 difference is within stochastic seed variance for a 3-seed ensemble on 30 test documents.
 
- Full per-seed and ensemble metrics are in [`results.json`](./results.json).
 
  ## Architecture
 
@@ -88,7 +88,7 @@ python inference_example.py
  | 42 | 73.78% | EMA |
  | 123 | 71.97% | EMA |
  | 456 | 75.59% | EMA |
- | **3-seed ensemble** | **73.58%** | — |
 
  For verification without re-running the model, each seed directory contains a `seed_probs.npz` file with the model's per-sentence sigmoid probabilities on the test and dev splits — sufficient to recompute every F1 number in the model card.
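
The recomputation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's evaluation script: the function name `ensemble_doc_f1`, the toy arrays, the union-over-sentences aggregation, and the averaging of per-document F1 scores are all hypothetical choices, and the actual array names inside `seed_probs.npz` are not shown in this diff. It also illustrates why the ensemble row need not equal the mean of the per-seed rows (for the table above, (73.78 + 71.97 + 75.59) / 3 = 73.78): an ensemble is assumed here to average probabilities before thresholding, not to average F1 scores.

```python
def ensemble_doc_f1(seed_probs, gold, doc_ids, tau=0.5):
    """Average per-seed sentence probabilities, threshold at tau, score per document.

    seed_probs: list of [n_sentences][n_labels] probability tables, one per seed.
    gold:       [n_sentences][n_labels] table of 0/1 gold labels.
    doc_ids:    document id for each sentence.
    """
    docs = sorted(set(doc_ids))
    pred = {d: set() for d in docs}
    true = {d: set() for d in docs}
    for i, doc in enumerate(doc_ids):
        for j, is_gold in enumerate(gold[i]):
            mean_p = sum(seed[i][j] for seed in seed_probs) / len(seed_probs)
            # Assumption: a document's label set is the union over its sentences.
            if mean_p >= tau:
                pred[doc].add(j)
            if is_gold:
                true[doc].add(j)
    scores = []
    for d in docs:
        denom = len(pred[d]) + len(true[d])
        # Assumption: a document with no gold and no predicted labels scores 1.0.
        scores.append(2 * len(pred[d] & true[d]) / denom if denom else 1.0)
    # Assumption: "per-document F1" is the mean of the document-level F1 scores.
    return sum(scores) / len(scores)


# Toy data: 3 sentences, 3 labels, 2 documents, 3 seeds (all values invented).
doc_ids = [0, 0, 1]
gold = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
s1 = [[0.9, 0.1, 0.1], [0.2, 0.4, 0.1], [0.1, 0.1, 0.8]]
s2 = [[0.8, 0.2, 0.1], [0.1, 0.7, 0.2], [0.2, 0.6, 0.7]]
s3 = [[0.7, 0.1, 0.2], [0.3, 0.6, 0.1], [0.1, 0.2, 0.9]]

per_seed = [ensemble_doc_f1([s], gold, doc_ids) for s in (s1, s2, s3)]
ensemble = ensemble_doc_f1([s1, s2, s3], gold, doc_ids)
print(per_seed, ensemble)
```

On this toy input two of the three seeds each make one error, while the averaged probabilities recover every label, so the ensemble score is higher than the mean of the per-seed scores.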
 
@@ -96,7 +96,7 @@ For verification without re-running the model, each seed directory contains a `s
 
  ```bibtex
  @inproceedings{cassandra2026,
- title = {CASSANDRA: Why Training Recipe Matters More Than Model Size for ATT&CK Classification},
  author = {Anonymous},
  booktitle = {Proceedings of the 2026 ACM SIGSAC Conference on Computer and Communications Security (CCS)},
  year = {2026},
 
 
  On the **TRAM2** test set (30 scored documents):
 
+ - **3-seed ensemble per-document F1 (τ=0.5): 73.87%**
+ - Exceeds Llama 3.1 8B (72.50%, Buchel et al. 2025) at 73× fewer parameters.
 
+ The per-seed table below shows the live artifact's individual seed F1s and ensemble F1; small variance from the headline (≤0.3 F1) reflects inference-time floating-point ordering on different hardware. Full per-seed and ensemble metrics are in [`results.json`](./results.json).
 
  ## Architecture
 
 
  | 42 | 73.78% | EMA |
  | 123 | 71.97% | EMA |
  | 456 | 75.59% | EMA |
+ | **3-seed ensemble** | **73.87%** | — |
 
  For verification without re-running the model, each seed directory contains a `seed_probs.npz` file with the model's per-sentence sigmoid probabilities on the test and dev splits — sufficient to recompute every F1 number in the model card.
 
 
 
  ```bibtex
  @inproceedings{cassandra2026,
+ title = {CASSANDRA: How Many Parameters Suffice to Automate TTP Extractions from CTI Reports---Pushing Towards the Lower Bound},
  author = {Anonymous},
  booktitle = {Proceedings of the 2026 ACM SIGSAC Conference on Computer and Communications Security (CCS)},
  year = {2026},