---
title: BabelCast Mistral
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
---

# BabelCast — Mistral 7B Translation Pipeline

Real-time speech translation pipeline: **Whisper large-v3-turbo** (STT) + **Mistral 7B Instruct Q5_K_M** (translation) + **Qwen3-TTS** (voice dubbing).

## Stack

| Component | Model | Purpose |
|-----------|-------|---------|
| STT | `faster-whisper` large-v3-turbo | Speech-to-text |
| LLM | Mistral 7B Instruct v0.3 (GGUF Q5_K_M) | Translation via llama.cpp |
| TTS | Qwen3-TTS 0.6B with CUDA graphs | Voice dubbing |

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Health check |
| `POST` | `/v1/transcribe` | Audio -> text (STT) |
| `POST` | `/v1/translate/text` | Text translation |
| `POST` | `/v1/translate` | Audio -> translated text (STT + LLM) |
| `POST` | `/v1/tts` | Text -> speech |
| `POST` | `/v1/speech` | Full pipeline: audio -> STT -> translate -> TTS |

## Requirements

- NVIDIA GPU with 16+ GB VRAM (RTX 4090 recommended)
- CUDA 12.4+

## Run locally

```bash
docker build -t babelcast-mistral .
docker run --gpus all -p 8000:8000 babelcast-mistral
```

## Docker Hub

```bash
docker pull marcosremar/babelcast-mistral:latest
docker run --gpus all -p 8000:8000 marcosremar/babelcast-mistral:latest
```

## Deploy on GPU cloud

Works on RunPod, Vast.ai, TensorDock, or any Docker-capable GPU host. Models are downloaded on first boot (~5 min with fast internet).
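
## Example: calling the translation endpoint

Once the container is running, the `/v1/translate/text` endpoint can be exercised with a short client. This is a minimal sketch using only the Python standard library; the request field names (`text`, `source_lang`, `target_lang`) are assumptions, not confirmed by this README — check the server's interactive docs (FastAPI-style services usually expose them at `/docs`) for the actual schema.

```python
# Minimal client sketch for the BabelCast text-translation endpoint.
# NOTE: the JSON field names below are assumptions; verify them against
# the server's own API docs before relying on this.
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_payload(text: str, source_lang: str, target_lang: str) -> dict:
    """Assemble the (assumed) request body for /v1/translate/text."""
    return {
        "text": text,
        "source_lang": source_lang,
        "target_lang": target_lang,
    }


def translate_text(text: str, source_lang: str, target_lang: str,
                   base_url: str = BASE_URL) -> dict:
    """POST a translation request and return the decoded JSON response."""
    body = json.dumps(build_payload(text, source_lang, target_lang)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/translate/text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Requires a running server (docker run --gpus all -p 8000:8000 ...).
    print(translate_text("Hello, world!", "en", "pt"))
```

The audio endpoints (`/v1/transcribe`, `/v1/translate`, `/v1/speech`) take file uploads instead of JSON, so they are more conveniently tested with `curl -F file=@clip.wav http://localhost:8000/v1/transcribe` or similar.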