Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: [arXiv 1908.10084](https://arxiv.org/abs/1908.10084)
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
```
SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
```
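The Pooling and Normalize modules above can be sketched in plain PyTorch: mean-pool the token embeddings over non-padding positions, then L2-normalize so every sentence vector has unit length. This is an illustrative re-implementation for intuition, not the library's internal code; the function name and toy tensors are made up.

```python
import torch

def mean_pool_and_normalize(token_embeddings: torch.Tensor,
                            attention_mask: torch.Tensor) -> torch.Tensor:
    """Mean-pool token embeddings over real (non-padding) tokens, then L2-normalize.

    token_embeddings: (batch, seq_len, 384) output of the Transformer module
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).float()        # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)      # (batch, 384)
    counts = mask.sum(dim=1).clamp(min=1e-9)           # avoid division by zero
    pooled = summed / counts                           # mean pooling
    return pooled / pooled.norm(dim=1, keepdim=True)   # unit-length vectors

# Toy check: two sequences, the second with two padding positions
emb = torch.randn(2, 4, 384)
mask = torch.tensor([[1, 1, 1, 1], [1, 1, 0, 0]])
out = mean_pool_and_normalize(emb, mask)
print(out.shape)         # torch.Size([2, 384])
print(out.norm(dim=1))   # both norms ~1.0
```

Because the final vectors are unit-normalized, dot product and cosine similarity coincide, which is why cosine-based retrieval works directly on these embeddings.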
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/rag_tuned_minilm_100")

# Run inference
sentences = [
    'What three vector store backends does RapidFire AI support and what modes of operation do they offer?',
    'RapidFire AI also supports external persistent vector stores beyond the default in-memory FAISS.\nThis allows you to scale to larger corpora, persist indexes across runs and experiments, and leverage managed vector DBMS services.\nAs of this writing, **Pinecone** (hosted serverless or pod-based) and **PostgreSQL PGVector** (self-hosted or managed) are supported.\n\nEach external store supports three modes of operation:\n\n- **Create mode:** Build a new index from base documents from within RapidFire AI itself and use it for RAG.\n- **Read mode:** Retrieve from a pre-existing index and use it for RAG.\n- **Update mode:** Add new content to an existing index from additional base documents from within RapidFire AI itself and use it for RAG.\n\nSee the :doc:`API: LangChain RAG Spec page</ragspecs>` for more details on how to specify these external vector stores.',
    '.. py:function:: __init__(self, experiment_name: str, mode: str = "fit", experiments_path: str = "./rapidfire_experiments") -> None\n\n\t:param experiment_name: Unique name for this experiment\n\t:type experiment_name: str\n\t\n\t:param mode: Mode of this experiment, either :code:`"fit"` or :code:`"eval"`; default is :code:`"fit"`\n\t:type mode: str\n\t\n\t:param experiments_path: Path to a folder to store this experiment\'s artifacts. Default is ``"./rapidfire_experiments"``\n\t:type experiments_path: str, optional\n\n\t:return: None\n\t:rtype: None',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9882, 0.2547],
#         [0.9882, 1.0000, 0.2603],
#         [0.2547, 0.2603, 1.0000]])
```
Training dataset columns: `sentence_0`, `sentence_1`, and `label`

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| <code>What arguments does the RFModelConfig class accept for defining a model configuration in RapidFire AI?</code> | <code>:param search_cfg: The search algorithm type and its kwargs to use for retrieval of vectors/chunks, provided as a single dictionary. Must include a key :code:<br>* :code:`"similarity"`: Standard cosine similarity search.<br>* :code:`"similarity_score_threshold"`: Similarity search with minimum score threshold (SST).<br>* :code:`"mmr"`: Maximum Marginal Relevance (MMR) search for diversity.<br>Additional parameters for search configuration depend on the type; the keys can include the following:<br>* :code:`"k"`: Number of documents to retrieve. Default is 5.<br>* :code:`"filter"`: Optional filter criteria function for search results.<br>* :code:`"score_threshold"`: Only for SST. Minimum similarity score threshold.<br>* :code:`"fetch_k"`: Only for MMR. Number of documents to fetch before MMR reranking. Default is 20.<br>* :code:`"lambda_mult"`: Only for MMR...</code> | <code>0.0</code> |
| <code>How do reward functions work in RapidFire AI for GRPO training, and what arguments does TRL inject into them?</code> | <code>RapidFire AI supports both self-hosted open LLMs and closed model LLM APIs as the generator.<br>As of this writing, it wraps around the model config of vLLM for the former and the OpenAI API for the latter.<br>We plan to expand support for more generator plugins, including Gemini and Claude APIs, based on feedback.<br><br>RFvLLMModelConfig<br>------<br>This is a wrapper around vLLM's :class:`config` and :class:`SamplingParams` classes.<br>The full list of their arguments is available on `this page <https://docs.vllm.ai/en/latest/api/vllm/config/index.html>`__ and `this page <https://docs.vllm.ai/en/v0.6.4/dev/sampling_params.html>`__, respectively.<br>The difference here is that the individual arguments (knobs) can be :class:`List` valued or :class:`Range` valued in an :class:`RFvLLMModelConfig`.<br>That is how you can specify a base set of knob combinations from which a config group can be produced. Also read :doc:`the Multi-Config Specification page</configs>`.</code> | <code>0.0</code> |
| <code>What are the three use case tutorials provided for RAG and context engineering, and what type of workflow does each demonstrate?</code> | <code>This use case notebook features an all-closed model API workflow, with OpenAI calls used for both embedding and generation. So, you do not need a GPU to run this notebook.</code> | <code>1.0</code> |

Loss: `ContrastiveLoss` with these parameters:

```json
{
    "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
    "margin": 0.5,
    "size_average": true
}
```
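The configured loss follows the contrastive formulation of Hadsell et al. (2006), cited below: for a pair with cosine distance d and label y (1.0 = matching, 0.0 = non-matching), the per-pair loss is ½·(y·d² + (1−y)·max(margin − d, 0)²), averaged over the batch (`size_average: true`). A hedged numpy sketch of that formula, with the configured margin of 0.5 (this is an illustration of the math, not the library's implementation):

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, labels, margin=0.5):
    """Contrastive loss with cosine distance, per the parameters above.

    emb_a, emb_b: (batch, dim) embedding pairs
    labels:       (batch,) 1.0 for matching pairs, 0.0 for non-matching
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos_dist = 1.0 - (a * b).sum(axis=1)                        # cosine distance in [0, 2]
    pos = labels * cos_dist ** 2                                # pull matches together
    neg = (1 - labels) * np.maximum(margin - cos_dist, 0) ** 2  # push non-matches apart
    return (0.5 * (pos + neg)).mean()                           # size_average: true

# Identical pair labeled 1 -> zero loss; identical pair labeled 0 -> margin penalty
v = np.array([[1.0, 0.0], [0.0, 1.0]])
print(contrastive_loss(v, v, labels=np.array([1.0, 0.0])))  # 0.0625 = 0.5 * 0.5**2 / 2
```

Intuitively, matching pairs are penalized by their squared distance, while non-matching pairs are only penalized when they come closer than the margin.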
Non-default hyperparameters:

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 100
- `multi_dataset_batch_sampler`: round_robin

All hyperparameters:

- `do_predict`: False
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 100
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: None
- `warmup_ratio`: None
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `enable_jit_checkpoint`: False
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `use_cpu`: False
- `seed`: 42
- `data_seed`: None
- `bf16`: False
- `fp16`: False
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: -1
- `ddp_backend`: None
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `auto_find_batch_size`: False
- `full_determinism`: False
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `use_cache`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 38.4615 | 500 | 0.0022 |
| 76.9231 | 1000 | 0.0002 |
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

```bibtex
@inproceedings{hadsell2006dimensionality,
    author = {Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle = {2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title = {Dimensionality Reduction by Learning an Invariant Mapping},
    year = {2006},
    volume = {2},
    pages = {1735-1742},
    doi = {10.1109/CVPR.2006.100}
}
```
Base model: `sentence-transformers/all-MiniLM-L6-v2`