Seed-X-PPO-7B

Introduction

We are excited to introduce Seed-X, a series of open-source multilingual translation language models that includes an instruction model, a reinforcement learning model, and a reward model. The series pushes the boundaries of translation capability at the 7-billion-parameter scale.

This repository provides ONNX weights for https://huggingface.co/ByteDance-Seed/Seed-X-PPO-7B.

Usage

Python with Optimum and ORT:

import os

import onnxruntime as ort
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer, PretrainedConfig, GenerationConfig


def main():
    work_dir = "[huggingface/dir]"
    config = PretrainedConfig.from_pretrained(work_dir)
    gen_config = GenerationConfig.from_pretrained(work_dir)
    suffix = "_q4"

    model_path = os.path.join(work_dir, "onnx", f"model{suffix}.onnx")
    use_gpu = True
    providers = [
        ("CUDAExecutionProvider", {"device_id": 0})
    ] if use_gpu else []
    providers.append("CPUExecutionProvider")
    
    sess_options = ort.SessionOptions()
    ort_model = ort.InferenceSession(model_path, sess_options, providers=providers)
    llm_model = ORTModelForCausalLM(
        session=ort_model,
        config=config,
        generation_config=gen_config,
        use_io_binding=True,
        use_cache=True,
    )

    tokenizer = AutoTokenizer.from_pretrained(work_dir)
    prompt = "Translate the following English sentence into Spanish:\nYou are using a model of type mistral to instantiate a model of type . <es>"
    # Keep the inputs on the same device as the selected execution provider.
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda" if use_gpu else "cpu")

    print("Input prompt: ", prompt)

    generated_ids = llm_model.generate(
        **inputs,
        max_new_tokens=512,
        num_beams=4,
        do_sample=False,
        temperature=1.0,
    )

    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = generated_ids[:, inputs["input_ids"].shape[1]:]
    predicts = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
    print("Output: ", predicts)


if __name__ == "__main__":
    main()
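
The prompt in the script above follows a fixed pattern: an instruction naming the source and target languages, a newline, the sentence, and a trailing language tag such as <es>. A minimal sketch of a helper that assembles prompts in that pattern is shown below. The names build_prompt and LANGUAGE_TAGS are hypothetical and not part of the Seed-X release, and only the Spanish tag is taken from the example above; any other tag should be checked against the upstream model card before it is added.

```python
# Hypothetical helper: build_prompt and LANGUAGE_TAGS are illustrative names,
# not part of the Seed-X release.
# Only the Spanish tag appears in the usage example; add other languages
# after confirming their tags in the upstream model card.
LANGUAGE_TAGS = {
    "Spanish": "es",
}


def build_prompt(text, target_language, source_language="English"):
    """Assemble a translation prompt in the pattern used by the usage script."""
    if target_language not in LANGUAGE_TAGS:
        raise ValueError(f"No tag registered for {target_language!r}")
    tag = LANGUAGE_TAGS[target_language]
    return (
        f"Translate the following {source_language} sentence into "
        f"{target_language}:\n{text} <{tag}>"
    )


if __name__ == "__main__":
    print(build_prompt("Good morning.", "Spanish"))
```

The returned string can be passed to the tokenizer in place of the hard-coded prompt. Raising on an unregistered language avoids silently sending the model a tag it may not recognize.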