SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic search and retrieval, for example as the embedding model in a RAG pipeline.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text
  • Number of Parameters: 22.7M (F32)

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
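If you prefer to work with plain transformers instead of the sentence-transformers wrapper, the three modules above can be reproduced approximately as follows. This is a minimal sketch, not an official usage snippet; it assumes the exported checkpoint exposes standard BertModel weights and tokenizer files at the repository root, which SentenceTransformer exports normally do.

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Sketch of what the three modules above compute, using plain transformers.
tokenizer = AutoTokenizer.from_pretrained("ronit01/rag_tuned_minilm_100")
model = AutoModel.from_pretrained("ronit01/rag_tuned_minilm_100")

encoded = tokenizer(["example sentence"], padding=True, truncation=True,
                    max_length=256, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state   # (0): Transformer

# (1): mean pooling over non-padding tokens
mask = encoded["attention_mask"].unsqueeze(-1).float()
mean_pooled = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# (2): normalize to unit length, so a dot product equals cosine similarity
embeddings = F.normalize(mean_pooled, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 384])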

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/rag_tuned_minilm_100")
# Run inference
sentences = [
    'What three vector store backends does RapidFire AI support and what modes of operation do they offer?',
    'RapidFire AI also supports external persistent vector stores beyond the default in-memory FAISS.\nThis allows you to scale to larger corpora, persist indexes across runs and experiments, and leverage managed vector DBMS services.\nAs of this writing, **Pinecone** (hosted serverless or pod-based) and **PostgreSQL PGVector** (self-hosted or managed) are supported.\n\nEach external store supports three modes of operation:\n\n- **Create mode:** Build a new index from base documents from within RapidFire AI itself and use it for RAG.\n- **Read mode:** Retrieve from a pre-existing index and use it for RAG. \n- **Update mode:** Add new content to an existing index from additional base documents from within RapidFire AI itself and use it for RAG. \n\nSee the :doc:`API: LangChain RAG Spec page</ragspecs>` for more details on how to specify these external vector stores.',
    '.. py:function:: __init__(self, experiment_name: str, mode: str = "fit", experiments_path: str = "./rapidfire_experiments") -> None\n\n\t:param experiment_name: Unique name for this experiment\n\t:type experiment_name: str\n\t\n\t:param mode: Mode of this experiment, either :code:`"fit"` or :code:`"eval"`; default is :code:`"fit"`\n\t:type mode: str\n\t\n\t:param experiments_path: Path to a folder to store this experiment\'s artifacts. Default is ``"./rapidfire_experiments"``)\n\t:type experiments_path: str, optional \n\n\t:return: None\n\t:rtype: None',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9882, 0.2547],
#         [0.9882, 1.0000, 0.2603],
#         [0.2547, 0.2603, 1.0000]])
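In a retrieval setting, you typically encode the query and the candidate passages separately and rank passages by cosine similarity. The following is a minimal sketch; the query and passages are illustrative placeholders, not items from the training corpus.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ronit01/rag_tuned_minilm_100")

query = "Which external vector stores are supported?"
passages = [
    "Pinecone and PostgreSQL PGVector are supported as external vector stores.",
    "The experiment name must be unique and defaults to fit mode.",
]

# Embeddings are L2-normalized by the Normalize module, so cosine similarity
# reduces to a dot product under the hood.
query_emb = model.encode([query])
passage_embs = model.encode(passages)
scores = model.similarity(query_emb, passage_embs)  # shape: [1, len(passages)]

best = scores[0].argmax().item()
print(passages[best], scores[0][best].item())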

Training Details

Training Dataset

Unnamed Dataset

  • Size: 208 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 208 samples:
    |         | sentence_0                                         | sentence_1                                           | label                          |
    |:--------|:----------------------------------------------------|:------------------------------------------------------|:-------------------------------|
    | type    | string                                              | string                                                | float                          |
    | details | min: 11 tokens, mean: 24.87 tokens, max: 34 tokens  | min: 31 tokens, mean: 218.51 tokens, max: 256 tokens  | min: 0.0, mean: 0.25, max: 1.0 |
  • Samples:

    Sample 1 (label: 0.0)
      sentence_0: What arguments does the RFModelConfig class accept for defining a model configuration in RapidFire AI?
      sentence_1:
        :param search_cfg: The search algorithm type and its kwargs to use for retrieval of vectors/chunks, provided as a single dictionary. Must include a key :code:`"type"` with one of the following three options listed as value; default is :code:`"similarity"`.

        * :code:`"similarity"`: Standard cosine similarity search.
        * :code:`"similarity_score_threshold"`: Similarity search with minimum score threshold (SST).
        * :code:`"mmr"`: Maximum Marginal Relevance (MMR) search for diversity.

        Additional parameters for search configuration depend on the type; the keys can include the following:

        * :code:`"k"`: Number of documents to retrieve. Default is 5.
        * :code:`"filter"`: Optional filter criteria function for search results.
        * :code:`"score_threshold"`: Only for SST. Minimum similarity score threshold.
        * :code:`"fetch_k"`: Only for MMR. Number of documents to fetch before MMR reranking. Default is 20.
        * :code:`"lambda_mult"`: Only for MMR...

    Sample 2 (label: 0.0)
      sentence_0: How do reward functions work in RapidFire AI for GRPO training, and what arguments does TRL inject into them?
      sentence_1:
        RapidFire AI supports both self-hosted open LLMs and closed model LLM APIs as the generator.
        As of this writing, it wraps around the model config of vLLM for the former and the OpenAI API for the latter.
        We plan to expand support for more generator plugins, including Gemini and Claude APIs, based on feedback.

        RFvLLMModelConfig
        ------

        This is a wrapper around vLLM's :class:`config` and :class:`SamplingParams` classes.
        The full list of their arguments are available on `this page <https://docs.vllm.ai/en/latest/api/vllm/config/index.html>`__
        and `this page <https://docs.vllm.ai/en/v0.6.4/dev/sampling_params.html>`__, respectively.

        The difference here is that the individual arguments (knobs) can be :class:`List` valued or
        :class:`Range` valued in an :class:`RFvLLMModelConfig`.
        That is how you can specify a base set of knob combinations from which a config group can
        be produced. Also read :doc:`the Multi-Config Specification page</configs>`.

    Sample 3 (label: 1.0)
      sentence_0: What are the three use case tutorials provided for RAG and context engineering, and what type of workflow does each demonstrate?
      sentence_1: This use case notebook features an all-closed model API workflow, with Open AI calls used for both embedding for generation. So, you do not need a GPU to run this notebook.
  • Loss: ContrastiveLoss with these parameters:
    {
        "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
        "margin": 0.5,
        "size_average": true
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 100
  • multi_dataset_batch_sampler: round_robin
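For reference, the dataset format, loss, and hyperparameters above can be combined into a training script roughly as follows. This is a minimal sketch, not the exact script used to train this model: the pair shown is a placeholder, since the 208-pair training dataset is not published, and the output directory name is arbitrary.

from datasets import Dataset
from sentence_transformers import (SentenceTransformer, SentenceTransformerTrainer,
                                   SentenceTransformerTrainingArguments, losses)

# Placeholder pair in the same (sentence_0, sentence_1, label) format as the
# training dataset described above; the real pairs are not published.
train_dataset = Dataset.from_dict({
    "sentence_0": ["What does RapidFire AI use for retrieval by default?"],
    "sentence_1": ["The default vector store is an in-memory FAISS index."],
    "label": [1.0],
})

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
loss = losses.ContrastiveLoss(model, margin=0.5)  # cosine distance is the default metric

args = SentenceTransformerTrainingArguments(
    output_dir="rag_tuned_minilm",          # arbitrary output path for this sketch
    per_device_train_batch_size=16,
    num_train_epochs=100,
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss,
)
trainer.train()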

All Hyperparameters

  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 100
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

| Epoch   | Step | Training Loss |
|:--------|:-----|:--------------|
| 38.4615 | 500  | 0.0022        |
| 76.9231 | 1000 | 0.0002        |

Training Time

  • Training: 3.6 minutes

Framework Versions

  • Python: 3.12.13
  • Sentence Transformers: 5.4.1
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ContrastiveLoss

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}