all-MiniLM-L6-v49-pair_score

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on the pairs_with_scores_v41 dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: pairs_with_scores_v41

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
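
The Pooling module above does mask-aware mean pooling over token embeddings, and the final Normalize() module L2-normalizes each sentence embedding, so dot product and cosine similarity coincide. A pure-Python sketch of these two steps (illustrative only, with tiny made-up vectors standing in for real BERT token outputs):

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Mask-aware mean pooling: average only real (non-padding) tokens."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    n_real = sum(attention_mask)
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            for d in range(dim):
                summed[d] += vec[d]
    return [s / n_real for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length (what the Normalize() module does)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Two real token vectors plus one padding token (mask = 0)
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)   # [2.0, 3.0] (padding ignored)
embedding = l2_normalize(pooled)   # unit length
```

Because of the Normalize() step, downstream similarity can be computed as a plain dot product.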

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("KhaledReda/all-MiniLM-L6-v49-pair_score")
# Run inference
sentences = [
    'black t shirt',
    'classic 76 119 285 1.5 l',
    'nan lip moisturizer',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.0933, -0.1401],
#         [-0.0933,  1.0000,  0.0655],
#         [-0.1401,  0.0655,  1.0000]])
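
Since the model L2-normalizes its outputs, cosine similarity reduces to a dot product, which makes semantic search a matter of ranking document embeddings by dot product with the query embedding. A toy sketch of that ranking step (made-up 3-dimensional unit vectors stand in for real 384-dimensional embeddings; `semantic_search` here is an illustrative helper, not a library function):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def semantic_search(query_emb, doc_embs, top_k=2):
    """Rank documents by cosine similarity (dot product on unit vectors)."""
    ranked = sorted(enumerate(doc_embs),
                    key=lambda pair: dot(query_emb, pair[1]),
                    reverse=True)
    return [(idx, dot(query_emb, emb)) for idx, emb in ranked[:top_k]]

query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.8, 0.6, 0.0], [1.0, 0.0, 0.0]]
print(semantic_search(query, docs))  # [(2, 1.0), (1, 0.8)]
```

With the real model, `query` and `docs` would come from `model.encode(...)`.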

Training Details

Training Dataset

pairs_with_scores_v41

  • Dataset: pairs_with_scores_v41 at c0ff11e
  • Size: 29,450,829 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min 3, mean 6.18, max 35 tokens
    • sentence2: string; min 3, mean 43.33, max 234 tokens
    • score: float; min 0.0, mean 0.01, max 1.0
  • Samples (sentence1 | sentence2 | score):
    • baked sunny side up eggs | home decor accessory deer christmas ornament - 3 silver ornament silver deer ornament artshop ornament deer decoration silver christmas silver deer christmas ornament 1 per box | 0.0
    • fabric soup | cream of red beets cream beets red beets beetroots cream of red beetroots 235 calories / serving container. all soups are made with coconut cream olive oil vegetable broth. | 0.0
    • tea bags | pure plast - cling film 40 cm x 20 m - 1 pcs cling wrap and foil pure plast cling film 20 m 1 pcs 40 cm | 0.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
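
CoSENTLoss is a ranking objective over pairs: for any two pairs whose gold scores differ, the pair with the higher score should receive the higher cosine similarity, and violations are penalized through a scaled log-sum-exp. A pure-Python sketch of that formula (an illustration of the objective, not the library's implementation):

```python
import math

def cosent_loss(cos_sims, scores, scale=20.0):
    """CoSENT: loss = log(1 + sum over (i, j) with scores[i] > scores[j]
    of exp(scale * (cos_sims[j] - cos_sims[i])))."""
    terms = [
        math.exp(scale * (cos_sims[j] - cos_sims[i]))
        for i in range(len(scores))
        for j in range(len(scores))
        if scores[i] > scores[j]
    ]
    return math.log1p(sum(terms))

# Correctly ordered similarities give a near-zero loss ...
low = cosent_loss([0.9, 0.1], [1.0, 0.0])
# ... while inverted ones are heavily penalized.
high = cosent_loss([0.1, 0.9], [1.0, 0.0])
```

The scale of 20.0 matches the configuration above; larger scales penalize ranking violations more sharply.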
    

Evaluation Dataset

pairs_with_scores_v41

  • Dataset: pairs_with_scores_v41 at c0ff11e
  • Size: 147,995 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min 3, mean 6.16, max 28 tokens
    • sentence2: string; min 3, mean 45.12, max 237 tokens
    • score: float; min 0.0, mean 0.02, max 1.0
  • Samples (sentence1 | sentence2 | score):
    • teddy | speaker and sub jbl authentics 300 - black and sub authentics black jbl | 0.0
    • collagenrich dog food | dettol | 0.0
    • pet deodorizing spray | leather suspender | 0.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
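
These non-default hyperparameters map directly onto the Sentence Transformers training API. A sketch of how such a run could be configured (the output directory and dataset wiring are placeholders, not the author's actual script):

```python
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
loss = CoSENTLoss(model, scale=20.0)  # pairwise_cos_sim is the default

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    learning_rate=2e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
)

# trainer = SentenceTransformerTrainer(
#     model=model, args=args,
#     train_dataset=..., eval_dataset=...,  # pairs_with_scores_v41 splits
#     loss=loss,
# )
# trainer.train()
```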

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.9564 440100 0.3614
0.9566 440200 0.3618
0.9568 440300 0.4087
0.9570 440400 0.5208
0.9573 440500 0.201
0.9575 440600 0.3574
0.9577 440700 0.3319
0.9579 440800 0.5622
0.9581 440900 0.4537
0.9583 441000 0.2788
0.9586 441100 0.3204
0.9588 441200 0.1998
0.9590 441300 0.1835
0.9592 441400 0.2816
0.9594 441500 0.3626
0.9596 441600 0.3397
0.9599 441700 0.2483
0.9601 441800 0.4106
0.9603 441900 0.4449
0.9605 442000 0.3463
0.9607 442100 0.3919
0.9609 442200 0.4745
0.9612 442300 0.163
0.9614 442400 0.2097
0.9616 442500 0.3953
0.9618 442600 0.3777
0.9620 442700 0.2438
0.9623 442800 0.252
0.9625 442900 0.1972
0.9627 443000 0.4356
0.9629 443100 0.2066
0.9631 443200 0.2555
0.9633 443300 0.437
0.9636 443400 0.454
0.9638 443500 0.3138
0.9640 443600 0.4779
0.9642 443700 0.3901
0.9644 443800 0.51
0.9646 443900 0.3963
0.9649 444000 0.2881
0.9651 444100 0.2678
0.9653 444200 0.3198
0.9655 444300 0.4014
0.9657 444400 0.3307
0.9659 444500 0.3433
0.9662 444600 0.2724
0.9664 444700 0.2165
0.9666 444800 0.4965
0.9668 444900 0.3912
0.9670 445000 0.3634
0.9673 445100 0.4186
0.9675 445200 0.3839
0.9677 445300 0.3224
0.9679 445400 0.4699
0.9681 445500 0.2369
0.9683 445600 0.305
0.9686 445700 0.3043
0.9688 445800 0.3976
0.9690 445900 0.3347
0.9692 446000 0.0874
0.9694 446100 0.5428
0.9696 446200 0.3654
0.9699 446300 0.3433
0.9701 446400 0.4929
0.9703 446500 0.3115
0.9705 446600 0.2371
0.9707 446700 0.3866
0.9709 446800 0.2423
0.9712 446900 0.3694
0.9714 447000 0.5806
0.9716 447100 0.4009
0.9718 447200 0.4734
0.9720 447300 0.3467
0.9722 447400 0.3424
0.9725 447500 0.3567
0.9727 447600 0.222
0.9729 447700 0.3959
0.9731 447800 0.2983
0.9733 447900 0.1348
0.9736 448000 0.3969
0.9738 448100 0.3171
0.9740 448200 0.3058
0.9742 448300 0.3031
0.9744 448400 0.1975
0.9746 448500 0.5005
0.9749 448600 0.3297
0.9751 448700 0.3869
0.9753 448800 0.3293
0.9755 448900 0.3119
0.9757 449000 0.4127
0.9759 449100 0.3758
0.9762 449200 0.3959
0.9764 449300 0.2
0.9766 449400 0.2102
0.9768 449500 0.5711
0.9770 449600 0.6681
0.9772 449700 0.4882
0.9775 449800 0.2815
0.9777 449900 0.2165
0.9779 450000 0.2737
0.9781 450100 0.4616
0.9783 450200 0.3245
0.9786 450300 0.2996
0.9788 450400 0.1052
0.9790 450500 0.5346
0.9792 450600 0.2717
0.9794 450700 0.2122
0.9796 450800 0.4603
0.9799 450900 0.6163
0.9801 451000 0.4955
0.9803 451100 0.4505
0.9805 451200 0.4884
0.9807 451300 0.3573
0.9809 451400 0.3374
0.9812 451500 0.5565
0.9814 451600 0.5794
0.9816 451700 0.7069
0.9818 451800 0.2379
0.9820 451900 0.2543
0.9822 452000 0.2024
0.9825 452100 0.1231
0.9827 452200 0.3766
0.9829 452300 0.4853
0.9831 452400 0.4873
0.9833 452500 0.4789
0.9835 452600 0.3463
0.9838 452700 0.292
0.9840 452800 0.3134
0.9842 452900 0.3785
0.9844 453000 0.3129
0.9846 453100 0.3602
0.9849 453200 0.3
0.9851 453300 0.2282
0.9853 453400 0.1827
0.9855 453500 0.4163
0.9857 453600 0.242
0.9859 453700 0.4047
0.9862 453800 0.5129
0.9864 453900 0.4737
0.9866 454000 0.2933
0.9868 454100 0.2462
0.9870 454200 0.2297
0.9872 454300 0.3121
0.9875 454400 0.3317
0.9877 454500 0.2139
0.9879 454600 0.3243
0.9881 454700 0.2504
0.9883 454800 0.248
0.9885 454900 0.524
0.9888 455000 0.5411
0.9890 455100 0.2952
0.9892 455200 0.4317
0.9894 455300 0.3344
0.9896 455400 0.3379
0.9899 455500 0.1478
0.9901 455600 0.581
0.9903 455700 0.2967
0.9905 455800 0.2757
0.9907 455900 0.2212
0.9909 456000 0.3731
0.9912 456100 0.2975
0.9914 456200 0.4897
0.9916 456300 0.4707
0.9918 456400 0.4309
0.9920 456500 0.3329
0.9922 456600 0.4147
0.9925 456700 0.1688
0.9927 456800 0.464
0.9929 456900 0.2772
0.9931 457000 0.1759
0.9933 457100 0.4468
0.9935 457200 0.3676
0.9938 457300 0.1651
0.9940 457400 0.2744
0.9942 457500 0.4478
0.9944 457600 0.2895
0.9946 457700 0.3736
0.9948 457800 0.5262
0.9951 457900 0.406
0.9953 458000 0.4381
0.9955 458100 0.5408
0.9957 458200 0.4406
0.9959 458300 0.4051
0.9962 458400 0.3769
0.9964 458500 0.4276
0.9966 458600 0.2825
0.9968 458700 0.2271
0.9970 458800 0.3214
0.9972 458900 0.4274
0.9975 459000 0.332
0.9977 459100 0.4695
0.9979 459200 0.2942
0.9981 459300 0.3683
0.9983 459400 0.3422
0.9985 459500 0.3291
0.9988 459600 0.4092
0.9990 459700 0.4295
0.9992 459800 0.2956
0.9994 459900 0.4245
0.9996 460000 0.2533
0.9998 460100 0.4611

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}

Model Size

  • Parameters: 22.7M
  • Tensor type: F32 (Safetensors)