vLLM support
#1
by deeksha5 - opened
Is this model supported by vLLM?
Yes. The model can be served with vLLM: it supports the transcription task, and vLLM exposes it through an OpenAI-compatible API.
Serving the Model
# Start an OpenAI-compatible server; --task transcription enables the
# /v1/audio/transcriptions endpoint, and --gpu-memory-utilization caps
# vLLM at 50% of GPU memory.
vllm serve shunyalabs/zero-stt-hinglish \
  --task transcription \
  --gpu-memory-utilization 0.5
Python Client Example
from openai import OpenAI

# Point the OpenAI client at the local vLLM server.
client = OpenAI(
    api_key="EMPTY",
    base_url="http://localhost:8000/v1",
)

# Send the audio file to the transcription endpoint.
with open("audio.wav", "rb") as f:
    transcription = client.audio.transcriptions.create(
        file=f,
        model="shunyalabs/zero-stt-hinglish",
        response_format="text",
        language="hi",
        temperature=0.0,
    )

# With response_format="text", the client returns the transcript as a plain string.
print(transcription)
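If you prefer to call the HTTP endpoint directly, here is a minimal sketch of the same request using the requests library; the endpoint path and form fields follow the OpenAI transcription API that vLLM exposes, and audio.wav is just a placeholder file name.

import requests

# POST multipart/form-data to the OpenAI-compatible transcription endpoint.
with open("audio.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/v1/audio/transcriptions",
        files={"file": ("audio.wav", f, "audio/wav")},
        data={
            "model": "shunyalabs/zero-stt-hinglish",
            "response_format": "text",
            "language": "hi",
        },
    )

print(resp.status_code, resp.text)

A 200 response with the transcript in the body confirms the server accepts the OpenAI-compatible multipart format.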
Notes
- The vLLM server must be running at http://localhost:8000/v1 (a quick check is shown below)
- The model supports the transcription task
- The API follows the OpenAI-compatible request/response format
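As a quick sanity check that the server is reachable (a minimal sketch, assuming the default port 8000 from the serve command above), you can list the models the running vLLM instance is serving:

from openai import OpenAI

client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

# The served model id should appear as "shunyalabs/zero-stt-hinglish".
for model in client.models.list():
    print(model.id)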
ayush-shunyalabs changed discussion status to closed