max token length question
#3
by ColinKhan - opened
Hi!
I've been using this model for vector search.
Recently I noticed that its default max sequence length is 128, while the model page says the max sequence length is 256.
However, lower on the same page it says the model was trained with a 128-token length.
So I'm not sure whether it's OK to increase the token length to 256. Will this reduce the quality of the vectors, given that the model was trained with a 128-token length?
And since it's in the sentence-transformers library, max_seq_length can even be set to 512. Can I also do this for this model?
Thanks!
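For anyone who wants to try the settings discussed above, here is a minimal sketch of how the limit can be inspected and changed through the `max_seq_length` attribute of a `SentenceTransformer` object. The helper `set_max_seq_length`, its `position_limit=512` default, and the `demo` function are hypothetical names made up for illustration. The 512 ceiling is an assumption based on the usual BERT position-embedding size, so it is worth checking against the model's own config. The sketch only shows how to change the value and count tokens; it does not say whether embedding quality holds up beyond the 128-token training length.

```python
def set_max_seq_length(model, new_length, position_limit=512):
    """Set model.max_seq_length and return the previous value.

    position_limit is an ASSUMPTION (typical BERT position-embedding size);
    check max_position_embeddings in the model's config before relying on it.
    """
    if not 1 <= new_length <= position_limit:
        raise ValueError(
            f"new_length must be between 1 and {position_limit}, got {new_length}"
        )
    previous = model.max_seq_length
    model.max_seq_length = new_length
    return previous


def demo():
    """Illustrative usage; downloads the model on first call."""
    # Imported here so the helper above can be used without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")
    print("default max_seq_length:", model.max_seq_length)

    previous = set_max_seq_length(model, 256)
    print("changed from", previous, "to", model.max_seq_length)

    # Count tokens (including special tokens) to see whether a text
    # would be truncated at the current limit.
    text = "That is a very happy person"
    n_tokens = len(model.tokenizer(text)["input_ids"])
    print("token count:", n_tokens, "truncated:", n_tokens > model.max_seq_length)

    # Inputs longer than max_seq_length are truncated before encoding.
    embedding = model.encode(text)
    print("embedding shape:", embedding.shape)
```

Calling `demo()` prints the default limit, the changed limit, and whether the sample text would be truncated. Comparing search results at 128 and 256 on your own data is probably the most direct way to see whether the longer limit helps or hurts.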