---
license: apache-2.0
tags:
- audio
- speech
- tokenizer
- vocoder
- wavcoch
library_name: transformers
---

# WavCochCausalV64000100M

**WavCoch** is a causal waveform-to-cochleagram tokenizer by **Greta Tuckute** and **Klemen Kotar**.

## Model Details

| Parameter | Value |
|-----------|-------|
| Parameters | ~93.05M |
| Window Size | 1001 |
| Hop Length | 80 |
| Encoder Dim | 1536 |
| Vocabulary Size | 64000 |
| Includes Vocoder | False |

## Usage

```python
from transformers import AutoModel

# Load the tokenizer; the model code lives in the repo, so remote code must be trusted.
wavcoch = AutoModel.from_pretrained(
    "TuKoResearch/WavCochCausalV64000100M",
    trust_remote_code=True,
)

# `waveform_tensor` is your input audio, sampled at 16 kHz.
# Waveform -> discrete token codes
codes = wavcoch.quantize(waveform_tensor)
# Codes -> cochleagram
coch = wavcoch.decode(codes)

# Post-FSQ projected embeddings (the single exposed hidden-state layer)
embeddings = wavcoch(
    input_values=waveform_tensor,
    output_hidden_states=True,
    sampling_rate=16000,
).hidden_states[0]
```

## Notes

This repo contains the WavCoch tokenizer/autoencoder only. Decoding back to audio requires a vocoder-enabled checkpoint.

When called with `output_hidden_states=True`, WavCoch exposes a single hidden-state layer: the post-FSQ projected embedding sequence used for direct probing.
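
As a rough guide to sequence lengths, the hop length from the Model Details table and the 16 kHz sampling rate from the usage example suggest about 200 codes per second of audio. The sketch below assumes one code per hop; `approx_num_codes` is a hypothetical helper, not part of the WavCoch API, and the exact count depends on how the model pads its causal window, which this card does not specify.

```python
HOP_LENGTH = 80        # from the Model Details table
SAMPLING_RATE = 16000  # from the usage example


def approx_num_codes(num_samples: int, hop_length: int = HOP_LENGTH) -> int:
    """Approximate number of codes, assuming one code per hop (hypothetical helper)."""
    return num_samples // hop_length


# Approximate token rate implied by the hop length at 16 kHz.
codes_per_second = SAMPLING_RATE / HOP_LENGTH

# One second of audio at 16 kHz.
one_second_codes = approx_num_codes(SAMPLING_RATE)
```

Treat the result as an estimate and check it against the shape of the tensor returned by `wavcoch.quantize` for your own input.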