Crossword Space: MPNet-base ADE

An Asymmetric Dual Encoder (ADE) for Italian crossword clue answering, trained with contrastive learning on clue-answer pairs.

This model projects crossword clues and candidate answers into a shared 768-dimensional latent space, enabling efficient retrieval via FAISS inner-product search.

Model Description

  • Architecture: Asymmetric Dual Encoder with two separate XLM-RoBERTa encoders (one for clues, one for answers), a shared LayerNorm, and a shared linear projection head.
  • Base encoder: sentence-transformers/paraphrase-multilingual-mpnet-base-v2 (278M parameters per tower)
  • Training objective: Symmetric contrastive loss (InfoNCE) with in-batch hard negative mining and learnable temperature
  • Embedding dimension: 768
  • Training data: Italian crossword clues and dictionary definitions paired with their answers/defined words

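The symmetric contrastive objective above can be sketched as follows. This is a minimal NumPy illustration, not the repository's training code: the learnable temperature is modeled here as `exp(log_temp)`, and the cross-entropy targets are the diagonal of the in-batch similarity matrix (each clue's positive is its own answer; all other answers in the batch act as negatives).

```python
import numpy as np

def symmetric_info_nce(clue_emb, ans_emb, log_temp=0.0):
    """Symmetric InfoNCE over a batch of paired clue/answer embeddings."""
    # L2-normalize both towers' outputs.
    c = clue_emb / np.linalg.norm(clue_emb, axis=1, keepdims=True)
    a = ans_emb / np.linalg.norm(ans_emb, axis=1, keepdims=True)

    # Scaled similarity logits; exp(log_temp) plays the learnable-temperature role.
    logits = (c @ a.T) * np.exp(log_temp)

    def xent_diag(l):
        # Softmax cross-entropy with the correct pair on the diagonal.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # Average the clue->answer and answer->clue directions.
    return 0.5 * (xent_diag(logits) + xent_diag(logits.T))

# Perfectly aligned pairs give a lower loss than mismatched ones.
print(symmetric_info_nce(np.eye(4), np.eye(4)))
```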
For more details, see the paper: Crossword Space: Latent Manifold Learning for Italian Crosswords and Beyond (CLiC-it 2025). This model corresponds to our best configuration, which includes dictionary items during training.

Usage

Loading the model

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "cruciverb-it/crossword-space-mpnet-base-ade",
    trust_remote_code=True,
)

Or, if you have the crossword-space repository cloned locally:

from model import DualEncoderADE

model = DualEncoderADE.from_pretrained("cruciverb-it/crossword-space-mpnet-base-ade")

Full example

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "cruciverb-it/crossword-space-mpnet-base-ade",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("cruciverb-it/crossword-space-mpnet-base-ade")
model.eval()

clues = ["Capitale d'Italia", "Fiume che attraversa Roma"]  # "Capital of Italy", "River that crosses Rome"
answers = ["ROMA", "TEVERE"]

clue_enc = tokenizer(clues, padding=True, truncation=True, max_length=64, return_tensors="pt")
ans_enc = tokenizer(answers, padding=True, truncation=True, max_length=16, return_tensors="pt")

with torch.no_grad():
    clue_emb, ans_emb = model(
        def_input_ids=clue_enc["input_ids"],
        def_attention_mask=clue_enc["attention_mask"],
        ans_input_ids=ans_enc["input_ids"],
        ans_attention_mask=ans_enc["attention_mask"],
    )

# L2-normalize for cosine similarity / inner-product search
clue_emb = F.normalize(clue_emb, dim=-1)
ans_emb = F.normalize(ans_emb, dim=-1)

# Similarity matrix
similarity = clue_emb @ ans_emb.T
print(similarity)
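After embedding, answer retrieval reduces to maximum inner product over the normalized answer matrix. A minimal sketch with synthetic embeddings, using exact NumPy search in place of `faiss.IndexFlatIP` (which scores candidates the same way, just at scale):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for model outputs: 1000 answer embeddings.
ans_emb = rng.normal(size=(1000, 768)).astype("float32")
ans_emb /= np.linalg.norm(ans_emb, axis=1, keepdims=True)

# A clue embedding planted near answer #42 (plus a little noise).
clue_emb = ans_emb[42] + 0.01 * rng.normal(size=768).astype("float32")
clue_emb /= np.linalg.norm(clue_emb)

# Exact inner-product search; with FAISS this is IndexFlatIP(768) + add() + search().
scores = ans_emb @ clue_emb
top10 = np.argsort(-scores)[:10]
print(top10[0])  # the planted neighbor, 42
```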

Evaluation Results

Retrieval performance on four Italian test sets. Length-filtered metrics restrict the FAISS index to candidates matching the expected answer length, which is the standard setting for crossword solving.

Full Index

| Test Set   | Acc@1 | Acc@10 | Acc@100 | Acc@1000 | MRR  |
|------------|-------|--------|---------|----------|------|
| Crossword  | 31.8  | 63.4   | 81.3    | 91.0     | 42.7 |
| Dictionary | 17.5  | 40.5   | 63.3    | 82.2     | 25.3 |
| ONLI       | 11.8  | 34.9   | 61.0    | 81.5     | 19.7 |
| Neologisms | 9.0   | 25.0   | 59.0    | 82.0     | 14.6 |

Length-filtered Index

| Test Set   | Acc@1 | Acc@10 | Acc@100 | Acc@1000 | MRR  |
|------------|-------|--------|---------|----------|------|
| Crossword  | 54.0  | 80.2   | 91.3    | 97.2     | 63.6 |
| Dictionary | 35.6  | 62.6   | 82.5    | 95.7     | 45.0 |
| ONLI       | 36.0  | 65.0   | 84.3    | 95.8     | 45.8 |
| Neologisms | 28.0  | 60.0   | 84.0    | 94.0     | 38.1 |
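The length-filtered setting can be reproduced by restricting the candidate set before search. A minimal sketch, where the answer list and embeddings are illustrative placeholders rather than model outputs:

```python
import numpy as np

answers = ["ROMA", "TEVERE", "MILANO", "PO"]

rng = np.random.default_rng(0)
ans_emb = rng.normal(size=(len(answers), 8)).astype("float32")  # toy dimension
ans_emb /= np.linalg.norm(ans_emb, axis=1, keepdims=True)

def length_filtered_search(query_emb, target_len, k=10):
    # Keep only candidates whose length matches the grid slot,
    # then rank the survivors by inner product.
    keep = [i for i, a in enumerate(answers) if len(a) == target_len]
    scores = ans_emb[keep] @ query_emb
    order = np.argsort(-scores)[:k]
    return [(answers[keep[i]], float(scores[i])) for i in order]

# Only the 6-letter candidates (TEVERE, MILANO) are scored for a 6-letter slot.
print(length_filtered_search(ans_emb[1], 6))
```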

Citation

@inproceedings{ciaccio-etal-2025-crossword-space,
    title = "Crossword Space: Latent Manifold Learning for Italian Crosswords and Beyond",
    author = "Ciaccio, Cristiano and Sarti, Gabriele and Miaschi, Alessio and Dell'Orletta, Felice",
    booktitle = "Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)",
    year = "2025",
    url = "https://aclanthology.org/2025.clicit-1.26/"
}

License

CC BY 4.0
