---
title: BabelCast Mistral
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
---
# BabelCast — Mistral 7B Translation Pipeline
Real-time speech translation pipeline: Whisper large-v3-turbo (speech-to-text) + Mistral 7B Instruct Q5_K_M (translation) + Qwen3-TTS (voice dubbing).
## Stack
| Component | Model | Purpose |
|---|---|---|
| STT | faster-whisper large-v3-turbo | Speech-to-text |
| LLM | Mistral 7B Instruct v0.3 (GGUF Q5_K_M) | Translation via llama.cpp |
| TTS | Qwen3-TTS 0.6B with CUDA graphs | Voice dubbing |
## API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Health check |
| POST | `/v1/transcribe` | Audio -> text (STT) |
| POST | `/v1/translate/text` | Text translation |
| POST | `/v1/translate` | Audio -> translated text (STT + LLM) |
| POST | `/v1/tts` | Text -> speech |
| POST | `/v1/speech` | Full pipeline: audio -> STT -> translate -> TTS |
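As a minimal client sketch, the text-translation endpoint can be called with a JSON POST. Note that the request field names (`text`, `source_lang`, `target_lang`) are assumptions, since the request schema isn't documented here — check the server's actual API before relying on them:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # port used in the docker run examples below


def translate_payload(text, source_lang, target_lang):
    # Hypothetical field names -- verify against the server's real schema.
    return {"text": text, "source_lang": source_lang, "target_lang": target_lang}


def translate_text(text, source_lang, target_lang, base_url=BASE_URL):
    """POST /v1/translate/text and return the parsed JSON response."""
    data = json.dumps(translate_payload(text, source_lang, target_lang)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/translate/text",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```

The audio endpoints (`/v1/transcribe`, `/v1/translate`, `/v1/speech`) would take an audio upload instead of a JSON body; their exact content type is likewise not specified here.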
## Requirements
- NVIDIA GPU with 16+ GB VRAM (RTX 4090 recommended)
- CUDA 12.4+
## Run locally

```bash
docker build -t babelcast-mistral .
docker run --gpus all -p 8000:8000 babelcast-mistral
```
## Docker Hub

```bash
docker pull marcosremar/babelcast-mistral:latest
docker run --gpus all -p 8000:8000 marcosremar/babelcast-mistral:latest
```
## Deploy on GPU cloud

Works on RunPod, Vast.ai, TensorDock, or any Docker-capable GPU host. Models are downloaded on first boot (~5 min with a fast connection).
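Because models download on first boot, the container can take several minutes before it answers requests. A small readiness poll, sketched below against the `/health` endpoint (the 600 s default is an assumption sized to the ~5 min download), avoids hitting the API too early:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url="http://localhost:8000", timeout_s=600, interval_s=10):
    """Poll GET /health until the server returns 200, or give up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    url = f"{base_url.rstrip('/')}/health"
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; keep polling
        time.sleep(interval_s)
    return False
```

Usage: `wait_until_ready()` right after `docker run`, then start sending requests once it returns `True`.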