vLLM support

#1
by deeksha5 - opened

Is this model supported by vLLM?

Shunya Labs org

Yes. The model can be served with vLLM using the transcription task, and the server exposes an OpenAI-compatible API.

Serving the Model

# Serve the model with the transcription task, limiting vLLM to 50% of GPU memory
vllm serve shunyalabs/zero-stt-hinglish \
  --task transcription \
  --gpu-memory-utilization 0.5

Python Client Example

from openai import OpenAI

# Point the OpenAI client at the local vLLM server (any placeholder API key works)
client = OpenAI(
    api_key="EMPTY",
    base_url="http://localhost:8000/v1",
)

# Send the audio file to the OpenAI-compatible transcription endpoint
with open("audio.wav", "rb") as f:
    transcription = client.audio.transcriptions.create(
        file=f,
        model="shunyalabs/zero-stt-hinglish",
        response_format="text",   # return the transcript as a plain string
        language="hi",            # language hint for Hindi/Hinglish audio
        temperature=0.0           # deterministic decoding
    )

print(transcription)
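
If you prefer structured output, the same endpoint also accepts the OpenAI JSON response format. This is a minimal sketch that reuses the client and audio.wav from above; it assumes your vLLM version supports response_format="json" on the transcription endpoint.

# Request a JSON response instead of plain text
with open("audio.wav", "rb") as f:
    result = client.audio.transcriptions.create(
        file=f,
        model="shunyalabs/zero-stt-hinglish",
        response_format="json",
        language="hi",
        temperature=0.0
    )

# With response_format="json" the SDK returns an object with a .text attribute
print(result.text)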

Notes

  • The vLLM server must be running at http://localhost:8000/v1 (see the quick check below)
  • The model supports the transcription task
  • The API follows the OpenAI-compatible request/response format
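
As a quick sanity check that the server is up and the model is loaded, you can list the served models before sending any audio. This is a minimal sketch that reuses the client object from the Python example above.

# Confirm the vLLM server is reachable and the Hinglish model is being served
served = client.models.list()
print([m.id for m in served.data])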

ayush-shunyalabs changed discussion status to closed
