How to use cartesinus/iva_mt_wslot-m2m100_418M-en-fr with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("translation", model="cartesinus/iva_mt_wslot-m2m100_418M-en-fr")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("cartesinus/iva_mt_wslot-m2m100_418M-en-fr")
model = AutoModelForSeq2SeqLM.from_pretrained("cartesinus/iva_mt_wslot-m2m100_418M-en-fr")

This model is a fine-tuned version of facebook/m2m100_418M on the iva_mt_wslot dataset. It achieves the following results on the evaluation set:
More information needed
First, make sure the transformers library is installed (pip install transformers). Then download the model:
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
import torch

def translate(input_text, lang):
    # Tokenize the source text into PyTorch tensors
    input_ids = tokenizer(input_text, return_tensors="pt")
    # Force the decoder to start with the target-language token
    generated_tokens = model.generate(**input_ids, forced_bos_token_id=tokenizer.get_lang_id(lang))
    return tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)

model_name = "cartesinus/iva_mt_wslot-m2m100_418M-0.1.0-en-fr"
tokenizer = M2M100Tokenizer.from_pretrained(model_name, src_lang="en", tgt_lang="fr")
model = M2M100ForConditionalGeneration.from_pretrained(model_name)
Then you can translate either plain text like this:
print(translate("set the temperature on my thermostat", "fr"))
Or you can translate text with slot annotations, which are restored in the target language:
print(translate("wake me up at <a>nine am<a> on <b>friday<b>", "fr"))
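To work with annotated input or output, it can help to read the slot spans back out of a string. The following is a minimal sketch, assuming the tag format shown in the example above (the same single-letter tag, such as `<a>`, both opens and closes a slot). The helper names `extract_slots`, `strip_slots`, and `same_slot_tags` are illustrative and not part of the model or of Transformers.

```python
import re

# Assumed tag format, taken from the example above: <a>text<a>, where the same
# single-letter tag opens and closes a slot.
SLOT_PATTERN = re.compile(r"<([a-z])>(.*?)<\1>")

def extract_slots(text):
    """Return a {tag_letter: slot_text} dict for every annotated span."""
    return {m.group(1): m.group(2).strip() for m in SLOT_PATTERN.finditer(text)}

def strip_slots(text):
    """Remove the slot tags and keep only the plain sentence."""
    return SLOT_PATTERN.sub(lambda m: m.group(2), text)

def same_slot_tags(source, target):
    """Check that two strings carry the same set of slot letters."""
    return set(extract_slots(source)) == set(extract_slots(target))

source = "wake me up at <a>nine am<a> on <b>friday<b>"
print(extract_slots(source))  # {'a': 'nine am', 'b': 'friday'}
print(strip_slots(source))    # wake me up at nine am on friday
```

A check such as `same_slot_tags(source, translation)` could be used to flag translations in which a slot tag was dropped, assuming the model output keeps the same tag letters as the input.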
Limitations of translation with slot transfer:
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 0.0132 | 1.0 | 1700 | 0.0110 | 68.7161 | 21.6874 |
| 0.0083 | 2.0 | 3400 | 0.0093 | 70.3712 | 21.9443 |
| 0.006 | 3.0 | 5100 | 0.0093 | 71.5485 | 21.995 |
| 0.0044 | 4.0 | 6800 | 0.0091 | 71.2971 | 21.8371 |
| 0.0032 | 5.0 | 8500 | 0.0093 | 71.9252 | 21.9268 |
| 0.0026 | 6.0 | 10200 | 0.0094 | 72.2756 | 21.9543 |
| 0.002 | 7.0 | 11900 | 0.0094 | 72.5602 | 21.9543 |
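The table can be read in two ways, depending on which metric is used to pick a checkpoint. As a small sketch, the rows above are copied into a list (the variable names are illustrative) and the best epoch is selected once by BLEU and once by validation loss:

```python
# Rows copied from the training results table:
# (epoch, training loss, validation loss, BLEU)
results = [
    (1, 0.0132, 0.0110, 68.7161),
    (2, 0.0083, 0.0093, 70.3712),
    (3, 0.0060, 0.0093, 71.5485),
    (4, 0.0044, 0.0091, 71.2971),
    (5, 0.0032, 0.0093, 71.9252),
    (6, 0.0026, 0.0094, 72.2756),
    (7, 0.0020, 0.0094, 72.5602),
]

# Highest BLEU and lowest validation loss do not fall on the same epoch.
best_bleu = max(results, key=lambda r: r[3])
best_loss = min(results, key=lambda r: r[2])

print(f"Best BLEU: {best_bleu[3]} at epoch {best_bleu[0]}")
print(f"Lowest validation loss: {best_loss[2]} at epoch {best_loss[0]}")
```

In these numbers, BLEU is highest at epoch 7 while validation loss is lowest at epoch 4, so the two criteria would select different checkpoints.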
If you use this model, please cite the following:
@article{Sowanski2023SlotLI,
title={Slot Lost in Translation? Not Anymore: A Machine Translation Model for Virtual Assistants with Type-Independent Slot Transfer},
author={Marcin Sowanski and Artur Janicki},
journal={2023 30th International Conference on Systems, Signals and Image Processing (IWSSIP)},
year={2023},
pages={1-5}
}