How to use deutsche-telekom/gbert-large-paraphrase-euclidean with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("deutsche-telekom/gbert-large-paraphrase-euclidean")
sentences = [
    "Das ist eine glückliche Person",
    "Das ist ein glücklicher Hund",
    "Das ist eine sehr glückliche Person",
    "Heute ist ein sonniger Tag",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [4, 4]
How to use deutsche-telekom/gbert-large-paraphrase-euclidean with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("deutsche-telekom/gbert-large-paraphrase-euclidean")
model = AutoModel.from_pretrained("deutsche-telekom/gbert-large-paraphrase-euclidean")
How to use deutsche-telekom/gbert-large-paraphrase-euclidean with setfit:
from setfit import SetFitModel

model = SetFitModel.from_pretrained("deutsche-telekom/gbert-large-paraphrase-euclidean")
German BERT large paraphrase euclidean
This is a sentence-transformers model. It maps sentences and paragraphs (text) into a 1024-dimensional dense vector space. The model is intended to be used together with SetFit to improve German few-shot text classification. It has a sibling model called deutsche-telekom/gbert-large-paraphrase-cosine.
This model is based on deepset/gbert-large. Many thanks to deepset!
Training
Loss Function
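We used BatchHardSoftMarginTripletLoss with Euclidean distance as the loss function:
from sentence_transformers import losses
from sentence_transformers.losses import BatchHardTripletLossDistanceFunction

# Note: "eucledian_distance" is how sentence-transformers spells this identifier.
train_loss = losses.BatchHardSoftMarginTripletLoss(
    model=model,
    distance_metric=BatchHardTripletLossDistanceFunction.eucledian_distance,
)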
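To illustrate what this loss computes, here is a minimal pure-Python sketch of a batch-hard soft-margin triplet loss with Euclidean distance. It is not the sentence-transformers implementation: the function names are hypothetical, and skipping anchors that lack a positive or a negative in the batch is a simplifying assumption. For each anchor, the farthest same-label example and the closest different-label example are selected, and the loss is log(1 + exp(d_pos - d_neg)), averaged over anchors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_soft_margin_triplet_loss(embeddings, labels):
    """Illustrative sketch: mean softplus(hardest_pos - hardest_neg) over anchors."""
    per_anchor = []
    n = len(embeddings)
    for i in range(n):
        # Distances to all other examples with the same label (positives).
        pos = [euclidean(embeddings[i], embeddings[j])
               for j in range(n) if j != i and labels[j] == labels[i]]
        # Distances to all examples with a different label (negatives).
        neg = [euclidean(embeddings[i], embeddings[j])
               for j in range(n) if labels[j] != labels[i]]
        if not pos or not neg:
            continue  # assumption: skip anchors without a valid triplet
        hardest_pos = max(pos)  # farthest positive
        hardest_neg = min(neg)  # closest negative
        # Soft margin: softplus replaces the fixed-margin hinge.
        per_anchor.append(math.log1p(math.exp(hardest_pos - hardest_neg)))
    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0


# Toy 2-D "embeddings": two tight clusters far apart.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = batch_hard_soft_margin_triplet_loss(points, [0, 0, 1, 1])
bad = batch_hard_soft_margin_triplet_loss(points, [0, 1, 0, 1])
print(good < bad)
```

When labels agree with the clusters, positives are closer than negatives and the loss approaches zero. When labels cut across the clusters, the loss grows roughly linearly with the distance gap.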
Training Data
The model was trained on a carefully filtered subset of the
deutsche-telekom/ger-backtrans-paraphrase dataset.
We deleted the following pairs of sentences:
- min_char_len less than 15
- jaccard_similarity greater than 0.3
- de_token_count greater than 30
- en_de_token_count greater than 30
- cos_sim less than 0.85
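The deletion rules above can be sketched as a simple row filter. This is an illustrative sketch, not the authors' preprocessing code: it assumes that each sentence pair is a dict keyed by the column names listed above, and that a pair is deleted when it matches any one of the rules. The helper names and the sample rows are hypothetical.

```python
def should_delete(row):
    """Return True if a sentence pair matches any deletion rule (assumed OR)."""
    return (
        row["min_char_len"] < 15
        or row["jaccard_similarity"] > 0.3
        or row["de_token_count"] > 30
        or row["en_de_token_count"] > 30
        or row["cos_sim"] < 0.85
    )


def filter_pairs(rows):
    """Keep only the sentence pairs that match none of the deletion rules."""
    return [row for row in rows if not should_delete(row)]


# Hypothetical rows, for illustration only.
rows = [
    {"min_char_len": 40, "jaccard_similarity": 0.10, "de_token_count": 12,
     "en_de_token_count": 14, "cos_sim": 0.93},  # kept
    {"min_char_len": 9, "jaccard_similarity": 0.10, "de_token_count": 4,
     "en_de_token_count": 5, "cos_sim": 0.95},   # too short
    {"min_char_len": 40, "jaccard_similarity": 0.60, "de_token_count": 12,
     "en_de_token_count": 14, "cos_sim": 0.97},  # too much word overlap
    {"min_char_len": 40, "jaccard_similarity": 0.10, "de_token_count": 12,
     "en_de_token_count": 14, "cos_sim": 0.70},  # meaning drifted too far
]

kept = filter_pairs(rows)
print(len(kept))
```

Together, the thresholds remove pairs that are very short, very long, nearly identical on the surface (high Jaccard similarity), or not close enough in meaning (low cosine similarity).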
Hyperparameters
- learning_rate: 5.5512022294147105e-06
- num_epochs: 7
- train_batch_size: 68
- num_gpu: ???
Evaluation Results
We use the NLU Few-shot Benchmark - English and German dataset to evaluate this model in a German few-shot scenario.
Qualitative results
- multilingual sentence embeddings provide the worst results
- Electra models also deliver poor results
- German BERT base size model (deepset/gbert-base) provides good results
- German BERT large size model (deepset/gbert-large) provides very good results
- our fine-tuned models (this model and deutsche-telekom/gbert-large-paraphrase-cosine) provide the best results
Licensing
Copyright (c) 2023 Philip May, Deutsche Telekom AG
Copyright (c) 2022 deepset GmbH
Licensed under the MIT License (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License by reviewing the file LICENSE in the repository.