Silero VAD v5 - ONNX

Voice activity detection for the speech-android SDK.

  • ~260K params, <1ms latency per chunk
  • Input: 512 samples (32ms @ 16kHz)
  • Output: speech probability [0, 1]
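The chunk size above fixes the timing: 512 samples at 16 kHz is 512 / 16000 s = 32 ms per call. As a minimal sketch of that arithmetic, the helper below splits a sample buffer into 512-sample chunks. The function names are hypothetical, and zero-padding the final partial chunk is an assumption rather than documented behavior of the model or the SDK.

```python
CHUNK_SAMPLES = 512       # samples per model call (from the card)
SAMPLE_RATE_HZ = 16_000   # sample rate (from the card)


def chunk_duration_ms(samples=CHUNK_SAMPLES, rate=SAMPLE_RATE_HZ):
    """Duration of one chunk in milliseconds."""
    return 1000 * samples / rate


def split_into_chunks(samples, size=CHUNK_SAMPLES):
    """Split a sequence of float samples into fixed-size chunks.

    The last chunk is zero-padded to `size` (an assumption; dropping
    the remainder is an equally reasonable choice).
    """
    chunks = []
    for start in range(0, len(samples), size):
        chunk = list(samples[start:start + size])
        chunk += [0.0] * (size - len(chunk))
        chunks.append(chunk)
    return chunks
```

One second of audio (16000 samples) splits into 32 chunks with this helper, the last of which carries 128 real samples and 384 zeros.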

Tensors

Inputs

Name    Shape        Type
input   [1, 512]     float32
state   [2, 1, 128]  float32
sr      scalar       int64

Outputs

Name    Shape        Type
output  [1, 1]       float32
stateN  [2, 1, 128]  float32
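Because stateN has the same shape as state, the model is presumably meant to be run as a streaming loop in which each call's stateN is fed back as the next call's state. The sketch below illustrates that loop under several assumptions: the initial state is all zeros, the trailing partial chunk is skipped, and `session` is any object with an ONNX Runtime style `run(output_names, input_feed)` method. The function name `speech_probabilities` is hypothetical; the tensor names, shapes, and types are taken from the tables above.

```python
import numpy as np

CHUNK = 512          # samples per call (from the card)
SAMPLE_RATE = 16000  # Hz (from the card)


def speech_probabilities(session, audio):
    """Return one speech probability per full 512-sample chunk of `audio`."""
    audio = np.asarray(audio, dtype=np.float32)
    # Assumption: the recurrent state starts at zero for a new stream.
    state = np.zeros((2, 1, 128), dtype=np.float32)
    sr = np.array(SAMPLE_RATE, dtype=np.int64)  # 0-d (scalar) int64 tensor
    probs = []
    # Assumption: a trailing partial chunk is skipped, not padded.
    for start in range(0, len(audio) - CHUNK + 1, CHUNK):
        chunk = audio[start:start + CHUNK].reshape(1, CHUNK)
        # Request outputs by name so the order of the results is explicit.
        output, state = session.run(
            ["output", "stateN"],
            {"input": chunk, "state": state, "sr": sr},
        )
        probs.append(float(output[0, 0]))
    return probs
```

With ONNX Runtime installed, `session` could be an `onnxruntime.InferenceSession` created from the downloaded model file. Resetting `state` to zeros at the start of each new audio stream keeps one recording from influencing the next.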
