all-MiniLM-L6-v49-pair_score

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on the pairs_with_scores_v41 dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: pairs_with_scores_v41

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
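
The Pooling module above does mask-aware mean pooling over token embeddings, and the final Normalize() module L2-normalizes each sentence embedding, so dot product and cosine similarity coincide. A pure-Python sketch of these two steps (illustrative only, with tiny made-up vectors standing in for real BERT token outputs):

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Mask-aware mean pooling: average only real (non-padding) tokens."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    n_real = sum(attention_mask)
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            for d in range(dim):
                summed[d] += vec[d]
    return [s / n_real for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length (what the Normalize() module does)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Two real token vectors plus one padding token (mask = 0)
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)   # [2.0, 3.0] (padding ignored)
embedding = l2_normalize(pooled)   # unit length
```

Because of the Normalize() step, downstream similarity can be computed as a plain dot product.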

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("KhaledReda/all-MiniLM-L6-v49-pair_score")
# Run inference
sentences = [
    'black t shirt',
    'classic 76 119 285 1.5 l',
    'nan lip moisturizer',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.0933, -0.1401],
#         [-0.0933,  1.0000,  0.0655],
#         [-0.1401,  0.0655,  1.0000]])
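
Since the model L2-normalizes its outputs, cosine similarity reduces to a dot product, which makes semantic search a matter of ranking document embeddings by dot product with the query embedding. A toy sketch of that ranking step (made-up 3-dimensional unit vectors stand in for real 384-dimensional embeddings; `semantic_search` here is an illustrative helper, not a library function):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def semantic_search(query_emb, doc_embs, top_k=2):
    """Rank documents by cosine similarity (dot product on unit vectors)."""
    ranked = sorted(enumerate(doc_embs),
                    key=lambda pair: dot(query_emb, pair[1]),
                    reverse=True)
    return [(idx, dot(query_emb, emb)) for idx, emb in ranked[:top_k]]

query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.8, 0.6, 0.0], [1.0, 0.0, 0.0]]
print(semantic_search(query, docs))  # [(2, 1.0), (1, 0.8)]
```

With the real model, `query` and `docs` would come from `model.encode(...)`.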

Training Details

Training Dataset

pairs_with_scores_v41

  • Dataset: pairs_with_scores_v41 at c0ff11e
  • Size: 29,450,829 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min 3, mean 6.18, max 35 tokens
    • sentence2: string; min 3, mean 43.33, max 234 tokens
    • score: float; min 0.0, mean 0.01, max 1.0
  • Samples (sentence1 | sentence2 | score):
    • baked sunny side up eggs | home decor accessory deer christmas ornament - 3 silver ornament silver deer ornament artshop ornament deer decoration silver christmas silver deer christmas ornament 1 per box | 0.0
    • fabric soup | cream of red beets cream beets red beets beetroots cream of red beetroots 235 calories / serving container. all soups are made with coconut cream olive oil vegetable broth. | 0.0
    • tea bags | pure plast - cling film 40 cm x 20 m - 1 pcs cling wrap and foil pure plast cling film 20 m 1 pcs 40 cm | 0.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
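
CoSENTLoss is a ranking objective over pairs: for any two pairs whose gold scores differ, the pair with the higher score should receive the higher cosine similarity, and violations are penalized through a scaled log-sum-exp. A pure-Python sketch of that formula (an illustration of the objective, not the library's implementation):

```python
import math

def cosent_loss(cos_sims, scores, scale=20.0):
    """CoSENT: loss = log(1 + sum over (i, j) with scores[i] > scores[j]
    of exp(scale * (cos_sims[j] - cos_sims[i])))."""
    terms = [
        math.exp(scale * (cos_sims[j] - cos_sims[i]))
        for i in range(len(scores))
        for j in range(len(scores))
        if scores[i] > scores[j]
    ]
    return math.log1p(sum(terms))

# Correctly ordered similarities give a near-zero loss ...
low = cosent_loss([0.9, 0.1], [1.0, 0.0])
# ... while inverted ones are heavily penalized.
high = cosent_loss([0.1, 0.9], [1.0, 0.0])
```

The scale of 20.0 matches the configuration above; larger scales penalize ranking violations more sharply.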
    

Evaluation Dataset

pairs_with_scores_v41

  • Dataset: pairs_with_scores_v41 at c0ff11e
  • Size: 147,995 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min 3, mean 6.16, max 28 tokens
    • sentence2: string; min 3, mean 45.12, max 237 tokens
    • score: float; min 0.0, mean 0.02, max 1.0
  • Samples (sentence1 | sentence2 | score):
    • teddy | speaker and sub jbl authentics 300 - black and sub authentics black jbl | 0.0
    • collagenrich dog food | dettol | 0.0
    • pet deodorizing spray | leather suspender | 0.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
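
These non-default hyperparameters map directly onto the Sentence Transformers training API. A sketch of how such a run could be configured (the output directory and dataset wiring are placeholders, not the author's actual script):

```python
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
loss = CoSENTLoss(model, scale=20.0)  # pairwise_cos_sim is the default

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    learning_rate=2e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
)

# trainer = SentenceTransformerTrainer(
#     model=model, args=args,
#     train_dataset=..., eval_dataset=...,  # pairs_with_scores_v41 splits
#     loss=loss,
# )
# trainer.train()
```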

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.9564 440100 0.3614
0.9566 440200 0.3618
0.9568 440300 0.4087
0.9570 440400 0.5208
0.9573 440500 0.201
0.9575 440600 0.3574
0.9577 440700 0.3319
0.9579 440800 0.5622
0.9581 440900 0.4537
0.9583 441000 0.2788
0.9586 441100 0.3204
0.9588 441200 0.1998
0.9590 441300 0.1835
0.9592 441400 0.2816
0.9594 441500 0.3626
0.9596 441600 0.3397
0.9599 441700 0.2483
0.9601 441800 0.4106
0.9603 441900 0.4449
0.9605 442000 0.3463
0.9607 442100 0.3919
0.9609 442200 0.4745
0.9612 442300 0.163
0.9614 442400 0.2097
0.9616 442500 0.3953
0.9618 442600 0.3777
0.9620 442700 0.2438
0.9623 442800 0.252
0.9625 442900 0.1972
0.9627 443000 0.4356
0.9629 443100 0.2066
0.9631 443200 0.2555
0.9633 443300 0.437
0.9636 443400 0.454
0.9638 443500 0.3138
0.9640 443600 0.4779
0.9642 443700 0.3901
0.9644 443800 0.51
0.9646 443900 0.3963
0.9649 444000 0.2881
0.9651 444100 0.2678
0.9653 444200 0.3198
0.9655 444300 0.4014
0.9657 444400 0.3307
0.9659 444500 0.3433
0.9662 444600 0.2724
0.9664 444700 0.2165
0.9666 444800 0.4965
0.9668 444900 0.3912
0.9670 445000 0.3634
0.9673 445100 0.4186
0.9675 445200 0.3839
0.9677 445300 0.3224
0.9679 445400 0.4699
0.9681 445500 0.2369
0.9683 445600 0.305
0.9686 445700 0.3043
0.9688 445800 0.3976
0.9690 445900 0.3347
0.9692 446000 0.0874
0.9694 446100 0.5428
0.9696 446200 0.3654
0.9699 446300 0.3433
0.9701 446400 0.4929
0.9703 446500 0.3115
0.9705 446600 0.2371
0.9707 446700 0.3866
0.9709 446800 0.2423
0.9712 446900 0.3694
0.9714 447000 0.5806
0.9716 447100 0.4009
0.9718 447200 0.4734
0.9720 447300 0.3467
0.9722 447400 0.3424
0.9725 447500 0.3567
0.9727 447600 0.222
0.9729 447700 0.3959
0.9731 447800 0.2983
0.9733 447900 0.1348
0.9736 448000 0.3969
0.9738 448100 0.3171
0.9740 448200 0.3058
0.9742 448300 0.3031
0.9744 448400 0.1975
0.9746 448500 0.5005
0.9749 448600 0.3297
0.9751 448700 0.3869
0.9753 448800 0.3293
0.9755 448900 0.3119
0.9757 449000 0.4127
0.9759 449100 0.3758
0.9762 449200 0.3959
0.9764 449300 0.2
0.9766 449400 0.2102
0.9768 449500 0.5711
0.9770 449600 0.6681
0.9772 449700 0.4882
0.9775 449800 0.2815
0.9777 449900 0.2165
0.9779 450000 0.2737
0.9781 450100 0.4616
0.9783 450200 0.3245
0.9786 450300 0.2996
0.9788 450400 0.1052
0.9790 450500 0.5346
0.9792 450600 0.2717
0.9794 450700 0.2122
0.9796 450800 0.4603
0.9799 450900 0.6163
0.9801 451000 0.4955
0.9803 451100 0.4505
0.9805 451200 0.4884
0.9807 451300 0.3573
0.9809 451400 0.3374
0.9812 451500 0.5565
0.9814 451600 0.5794
0.9816 451700 0.7069
0.9818 451800 0.2379
0.9820 451900 0.2543
0.9822 452000 0.2024
0.9825 452100 0.1231
0.9827 452200 0.3766
0.9829 452300 0.4853
0.9831 452400 0.4873
0.9833 452500 0.4789
0.9835 452600 0.3463
0.9838 452700 0.292
0.9840 452800 0.3134
0.9842 452900 0.3785
0.9844 453000 0.3129
0.9846 453100 0.3602
0.9849 453200 0.3
0.9851 453300 0.2282
0.9853 453400 0.1827
0.9855 453500 0.4163
0.9857 453600 0.242
0.9859 453700 0.4047
0.9862 453800 0.5129
0.9864 453900 0.4737
0.9866 454000 0.2933
0.9868 454100 0.2462
0.9870 454200 0.2297
0.9872 454300 0.3121
0.9875 454400 0.3317
0.9877 454500 0.2139
0.9879 454600 0.3243
0.9881 454700 0.2504
0.9883 454800 0.248
0.9885 454900 0.524
0.9888 455000 0.5411
0.9890 455100 0.2952
0.9892 455200 0.4317
0.9894 455300 0.3344
0.9896 455400 0.3379
0.9899 455500 0.1478
0.9901 455600 0.581
0.9903 455700 0.2967
0.9905 455800 0.2757
0.9907 455900 0.2212
0.9909 456000 0.3731
0.9912 456100 0.2975
0.9914 456200 0.4897
0.9916 456300 0.4707
0.9918 456400 0.4309
0.9920 456500 0.3329
0.9922 456600 0.4147
0.9925 456700 0.1688
0.9927 456800 0.464
0.9929 456900 0.2772
0.9931 457000 0.1759
0.9933 457100 0.4468
0.9935 457200 0.3676
0.9938 457300 0.1651
0.9940 457400 0.2744
0.9942 457500 0.4478
0.9944 457600 0.2895
0.9946 457700 0.3736
0.9948 457800 0.5262
0.9951 457900 0.406
0.9953 458000 0.4381
0.9955 458100 0.5408
0.9957 458200 0.4406
0.9959 458300 0.4051
0.9962 458400 0.3769
0.9964 458500 0.4276
0.9966 458600 0.2825
0.9968 458700 0.2271
0.9970 458800 0.3214
0.9972 458900 0.4274
0.9975 459000 0.332
0.9977 459100 0.4695
0.9979 459200 0.2942
0.9981 459300 0.3683
0.9983 459400 0.3422
0.9985 459500 0.3291
0.9988 459600 0.4092
0.9990 459700 0.4295
0.9992 459800 0.2956
0.9994 459900 0.4245
0.9996 460000 0.2533
0.9998 460100 0.4611

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}

Model Size

  • Parameters: 22.7M
  • Tensor type: F32 (Safetensors)