---
language:
- en
tags:
- sentence-transformers
- bible
- cross-translation
- semantic-similarity
- embeddings
license: mit
datasets:
- LoveJesus/biblical-embedding-dataset-chirho
pipeline_tag: sentence-similarity
model-index:
- name: biblical-cross-translation-chirho
  results:
  - task:
      type: sentence-similarity
      name: Cross-Translation Semantic Similarity
    dataset:
      type: LoveJesus/biblical-embedding-dataset-chirho
      name: Biblical Embedding Dataset (Chirho)
    metrics:
    - type: accuracy
      value: 0.9988
      name: Accuracy@0.5
    - type: roc_auc
      value: 1.0000
      name: ROC AUC
    - type: spearmanr
      value: 0.4915
      name: Spearman Correlation
---
<!-- For God so loved the world that he gave his only begotten Son, -->
<!-- that whoever believes in him should not perish but have eternal life. - John 3:16 -->
# Cross-Translation Bible Embeddings
A sentence transformer fine-tuned to create a shared embedding space where semantically
equivalent Bible verses across different translations map to nearby vectors.
## Usage
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim
model = SentenceTransformer("LoveJesus/biblical-cross-translation-chirho")
verses = [
    "[KJV] In the beginning God created the heaven and the earth.",
    "[BBE] At the first God made the heaven and the earth.",
    "[KJV] And the earth was without form, and void;",
]
embeddings = model.encode(verses)
similarities = cos_sim(embeddings, embeddings)
print(similarities)
# Gen 1:1 KJV vs Gen 1:1 BBE: ~0.95 (same verse, different translation)
# Gen 1:1 KJV vs Gen 1:2 KJV: ~0.30 (different verses)
```
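Once a similarity matrix is available, a common next step is to look up each verse's closest counterpart. The sketch below is one possible way to do that in plain Python. `nearest_other` is a hypothetical helper, not part of sentence-transformers, and the matrix values are placeholders that echo the approximate figures in the comments above rather than real model output.

```python
def nearest_other(sim_matrix):
    """For each row, return (index, score) of the most similar *other* item.

    sim_matrix is a square list of lists; the diagonal (self-similarity)
    is skipped. Hypothetical helper for illustration only.
    """
    results = []
    for i, row in enumerate(sim_matrix):
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        best_score, best_j = max(candidates)
        results.append((best_j, best_score))
    return results

# Placeholder values, NOT actual model output.
# Rows/columns: Gen 1:1 KJV, Gen 1:1 BBE, Gen 1:2 KJV.
toy_matrix = [
    [1.00, 0.95, 0.30],
    [0.95, 1.00, 0.28],
    [0.30, 0.28, 1.00],
]
matches = nearest_other(toy_matrix)
print(matches)
```

With the real model, the tensor returned by `cos_sim` can be converted first, for example `nearest_other(similarities.tolist())`.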
## Training
- **Base model**: paraphrase-multilingual-MiniLM-L12-v2 (118M params, 384-dim)
- **Training**: Contrastive learning (CosineSimilarityLoss) on ~300K verse pairs
- **Translations**: KJV, ASV, YLT, BBE, WEB (all public domain)
- **Positive pairs**: Same verse in different translations
- **Negative pairs**: Different verses from the same translation
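The pairing scheme described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual training script: `build_pairs`, the input layout, the one-negative-per-verse sampling, and the 1.0 / 0.0 labels are all hypothetical. Only the `[KJV]`-style prefix comes from the usage example.

```python
import random
from itertools import combinations

def build_pairs(verses, seed=0):
    """Build (text_a, text_b, label) tuples from {verse_ref: {translation: text}}.

    Assumed labels: 1.0 for same verse across translations,
    0.0 for different verses within one translation.
    """
    rng = random.Random(seed)
    refs = sorted(verses)
    pairs = []
    for ref in refs:
        texts = verses[ref]
        # Positive pairs: same verse, two different translations.
        for t_a, t_b in combinations(sorted(texts), 2):
            pairs.append((f"[{t_a}] {texts[t_a]}", f"[{t_b}] {texts[t_b]}", 1.0))
        # Negative pairs: a different verse drawn from the same translation.
        for t in sorted(texts):
            others = [r for r in refs if r != ref and t in verses[r]]
            if others:
                other = rng.choice(others)
                pairs.append((f"[{t}] {texts[t]}", f"[{t}] {verses[other][t]}", 0.0))
    return pairs

sample = {
    "Gen 1:1": {
        "KJV": "In the beginning God created the heaven and the earth.",
        "BBE": "At the first God made the heaven and the earth.",
    },
    "Gen 1:2": {
        "KJV": "And the earth was without form, and void;",
    },
}
pairs = build_pairs(sample)
for text_a, text_b, label in pairs:
    print(label, "|", text_a, "|", text_b)
```

Tuples in this shape map naturally onto the (sentence pair, float label) inputs that `CosineSimilarityLoss` expects.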
## Part of bible.systems
This is model 5 of 5 in the [bible.systems](https://bible.systems) ML pipeline.
## Evaluation Results
Evaluated on a held-out test set of cross-translation verse pairs.
| Metric | Score |
|--------|-------|
| **Accuracy@0.5** (cosine sim threshold) | **0.9988** |
| **ROC AUC** | **1.0000** |
| **Spearman Correlation** | **0.4915** |
| **Avg Positive Similarity** | 0.9841 |
| **Avg Negative Similarity** | 0.0359 |
| **Similarity Gap** (pos - neg) | **0.9482** |
> The model achieves near-perfect discrimination between same-verse pairs across translations (high positive similarity) and different-verse pairs (low negative similarity), with a gap of 0.95. The Spearman correlation is moderate because within-class similarity variance is low (most positive pairs cluster near 0.98).
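
For readers who want to see how the threshold-based figures in the table relate to each other, here is a minimal sketch of the arithmetic. It is not the evaluation script used for this model: `summarize_scores` is a hypothetical helper, the `>=` comparison at the threshold is an assumption, and the scores are made up for illustration.

```python
from statistics import mean

def summarize_scores(scores, labels, threshold=0.5):
    """Summarize cosine scores against binary labels (1 = same verse, 0 = different).

    A pair counts as correct when (score >= threshold) matches its label.
    Hypothetical helper; assumes both classes are present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return {
        "accuracy": correct / len(scores),
        "avg_pos": mean(pos),
        "avg_neg": mean(neg),
        "gap": mean(pos) - mean(neg),
    }

# Made-up scores for illustration only, NOT the model's outputs.
scores = [0.98, 0.97, 0.05, 0.02, 0.60]
labels = [1, 1, 0, 0, 0]
report = summarize_scores(scores, labels)
print(report)
```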
---
*For God so loved the world...* — John 3:16