---
language:
- en
tags:
- sentence-transformers
- bible
- cross-translation
- semantic-similarity
- embeddings
license: mit
datasets:
- LoveJesus/biblical-embedding-dataset-chirho
pipeline_tag: sentence-similarity
model-index:
- name: biblical-cross-translation-chirho
  results:
  - task:
      type: sentence-similarity
      name: Cross-Translation Semantic Similarity
    dataset:
      type: LoveJesus/biblical-embedding-dataset-chirho
      name: Biblical Embedding Dataset (Chirho)
    metrics:
    - type: accuracy
      value: 0.9988
      name: Accuracy@0.5
    - type: roc_auc
      value: 1
      name: ROC AUC
    - type: spearmanr
      value: 0.4915
      name: Spearman Correlation
---
Cross-Translation Bible Embeddings
A sentence transformer fine-tuned to create a shared embedding space where semantically equivalent Bible verses across different translations map to nearby vectors.
Usage
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer("LoveJesus/biblical-cross-translation-chirho")

# Each verse carries its translation tag as a prefix
verses = [
    "[KJV] In the beginning God created the heaven and the earth.",
    "[BBE] At the first God made the heaven and the earth.",
    "[KJV] And the earth was without form, and void;",
]

embeddings = model.encode(verses)
similarities = cos_sim(embeddings, embeddings)  # 3x3 pairwise cosine matrix
print(similarities)
# similarities[0][1], Gen 1:1 KJV vs Gen 1:1 BBE: ~0.95 (same verse, different translation)
# similarities[0][2], Gen 1:1 KJV vs Gen 1:2 KJV: ~0.30 (different verses)
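A common use of the shared embedding space is to look up the same verse in another translation by nearest-neighbor search. The sketch below shows only the ranking step, in plain Python. The 3-dimensional vectors are made-up stand-ins for the model's 384-dimensional embeddings, and `cosine` and `best_match` are hypothetical helper names, not part of sentence-transformers.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def best_match(query, candidates):
    """Return (index, score) of the candidate most similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]

# Toy vectors (assumption): stand-ins for real model.encode(...) output.
toy_kjv_query = [0.9, 0.1, 0.0]
toy_bbe_candidates = [
    [0.0, 1.0, 0.0],  # unrelated verse
    [0.8, 0.2, 0.1],  # same verse, other translation
    [0.1, 0.0, 1.0],  # unrelated verse
]

match_index, match_score = best_match(toy_kjv_query, toy_bbe_candidates)
print(match_index, round(match_score, 3))
```

With real embeddings, `toy_kjv_query` would be one row of `model.encode(...)` and the candidates would be the encoded verses of the target translation.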
Training
- Base model: paraphrase-multilingual-MiniLM-L12-v2 (118M params, 384-dim)
- Training: Contrastive learning (CosineSimilarityLoss) on ~300K verse pairs
- Translations: KJV, ASV, YLT, BBE, WEB (all public domain)
- Positive pairs: Same verse in different translations
- Negative pairs: Different verses from the same translation
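The pairing scheme above can be sketched as follows. This is a minimal illustration, not the training script: `tag`, `build_pairs`, the corpus layout, the one-negative-per-verse sampling, and the 1.0 / 0.0 labels are all assumptions. For context, `CosineSimilarityLoss` in sentence-transformers by default minimizes the squared error between each pair's cosine similarity and its float label, which is why the sketch emits float labels.

```python
import random

def tag(translation, text):
    """Prefix a verse with its translation tag, as in the usage example."""
    return f"[{translation}] {text}"

def build_pairs(verses_by_translation, seed=0):
    """Build (text_a, text_b, label) pairs.

    verses_by_translation maps translation -> {verse_id: text}.
    Labels 1.0 / 0.0 are an assumption, not taken from the model card.
    """
    rng = random.Random(seed)
    translations = sorted(verses_by_translation)
    pairs = []

    # Positive pairs: the same verse in two different translations.
    for i, t_a in enumerate(translations):
        for t_b in translations[i + 1:]:
            verses_a = verses_by_translation[t_a]
            verses_b = verses_by_translation[t_b]
            for verse_id in sorted(set(verses_a) & set(verses_b)):
                pairs.append(
                    (tag(t_a, verses_a[verse_id]), tag(t_b, verses_b[verse_id]), 1.0)
                )

    # Negative pairs: two different verses from the same translation.
    for t in translations:
        verses = verses_by_translation[t]
        verse_ids = sorted(verses)
        for verse_id in verse_ids:
            others = [v for v in verse_ids if v != verse_id]
            if not others:
                continue  # a single verse cannot form a negative pair
            other_id = rng.choice(others)
            pairs.append((tag(t, verses[verse_id]), tag(t, verses[other_id]), 0.0))

    return pairs

# Toy corpus built from the verses in the usage example.
toy_corpus = {
    "KJV": {
        "Gen 1:1": "In the beginning God created the heaven and the earth.",
        "Gen 1:2": "And the earth was without form, and void;",
    },
    "BBE": {
        "Gen 1:1": "At the first God made the heaven and the earth.",
    },
}

toy_pairs = build_pairs(toy_corpus)
for pair in toy_pairs:
    print(pair)
```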
Part of bible.systems
This is model 5 of 5 in the bible.systems ML pipeline.
Evaluation Results
Evaluated on a held-out test set of cross-translation verse pairs.
| Metric | Score |
|---|---|
| Accuracy@0.5 (cosine sim threshold) | 0.9988 |
| ROC AUC | 1.0000 |
| Spearman Correlation | 0.4915 |
| Avg Positive Similarity | 0.9841 |
| Avg Negative Similarity | 0.0359 |
| Similarity Gap (pos - neg) | 0.9482 |
The model discriminates almost perfectly between same-verse pairs across translations (average similarity 0.98) and different-verse pairs (average similarity 0.04), a gap of 0.95. The Spearman correlation is only moderate because it is rank-based and within-class similarity variance is low (most positive pairs cluster near 0.98), so the ordering of pairs within each class carries little signal.
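To make the three headline metrics concrete, here is a small pure-Python sketch that computes them on toy data. The scores, the binary labels, and the 2-to-8 class ratio are invented for illustration and are not the model's test set. It shows that perfectly separated scores can give an accuracy and ROC AUC of 1.0 while the Spearman correlation stays well below 1, since ties among binary labels limit a rank correlation.

```python
import math

def accuracy_at_threshold(scores, labels, threshold=0.5):
    """Fraction of pairs whose thresholded score agrees with the label."""
    hits = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return hits / len(labels)

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def roc_auc(scores, labels):
    """ROC AUC via the rank-sum (Mann-Whitney U) identity."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy data (assumption): 2 positive and 8 negative pairs, perfectly separated.
toy_scores = [0.99, 0.98, 0.05, 0.04, 0.03, 0.02, 0.01, 0.00, -0.01, -0.02]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

toy_accuracy = accuracy_at_threshold(toy_scores, toy_labels)
toy_auc = roc_auc(toy_scores, toy_labels)
toy_spearman = spearman(toy_scores, toy_labels)
print(toy_accuracy, toy_auc, round(toy_spearman, 4))
```

On this toy set the accuracy and AUC are both 1.0 while the Spearman correlation is about 0.70. How closely this mirrors the reported 0.4915 depends on the actual label distribution of the test set, which the card does not state.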
For God so loved the world... — John 3:16