vLLM support
#1
by deeksha5 - opened
Is this model supported by vLLM?
Yes. The model can be served with vLLM: it supports the transcription task, and vLLM exposes it through an OpenAI-compatible API.
Serving the Model
# Start an OpenAI-compatible server; --task transcription enables the
# /v1/audio/transcriptions endpoint, and --gpu-memory-utilization caps
# vLLM at 50% of GPU memory.
vllm serve shunyalabs/zero-stt-hinglish \
  --task transcription \
  --gpu-memory-utilization 0.5
Python Client Example
from openai import OpenAI

# Point the OpenAI client at the local vLLM server.
client = OpenAI(
    api_key="EMPTY",
    base_url="http://localhost:8000/v1",
)

# Send the audio file to the transcription endpoint.
with open("audio.wav", "rb") as f:
    transcription = client.audio.transcriptions.create(
        file=f,
        model="shunyalabs/zero-stt-hinglish",
        response_format="text",
        language="hi",
        temperature=0.0,
    )

# With response_format="text", the client returns the transcript as a plain string.
print(transcription)
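If you prefer to call the HTTP endpoint directly, here is a minimal sketch of the same request using the requests library; the endpoint path and form fields follow the OpenAI transcription API that vLLM exposes, and audio.wav is just a placeholder file name.

import requests

# POST multipart/form-data to the OpenAI-compatible transcription endpoint.
with open("audio.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/v1/audio/transcriptions",
        files={"file": ("audio.wav", f, "audio/wav")},
        data={
            "model": "shunyalabs/zero-stt-hinglish",
            "response_format": "text",
            "language": "hi",
        },
    )

print(resp.status_code, resp.text)

A 200 response with the transcript in the body confirms the server accepts the OpenAI-compatible multipart format.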
Notes
- The vLLM server must be running at http://localhost:8000/v1 (a quick check is shown below)
- The model supports the transcription task
- The API follows the OpenAI-compatible request/response format
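As a quick sanity check that the server is reachable (a minimal sketch, assuming the default port 8000 from the serve command above), you can list the models the running vLLM instance is serving:

from openai import OpenAI

client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

# The served model id should appear as "shunyalabs/zero-stt-hinglish".
for model in client.models.list():
    print(model.id)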
ayush-shunyalabs changed discussion status to closed