Use cases

Common compositions of Voice SDK modules. Each row maps a real workload to the modules that make it possible.

| Use case | Modules |
| --- | --- |
| Call center AI | Voice Agent + Noise Suppression + Voice Biometrics |
| Meeting transcription | STT + Diarization + Language detection |
| Voice authentication | Voice Biometrics + Speaker ID |
| Podcast / audio cleanup | Noise Suppression |
| Multilingual voice assistant | Voice Agent (full pipeline) |
| Accessibility tools | STT + TTS |
| Speaker analytics | Voice Biometrics + Diarization |

Patterns worth highlighting

Denoise before transcribe

For call-center and field-recorded audio, route through Noise Suppression first — DeepFilterNet3 cleanup measurably improves Whisper accuracy on noisy inputs without changing the rest of the pipeline.
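The pattern is pure composition: insert the denoiser in front of the STT stage and leave everything downstream untouched. A minimal sketch, using stub stages in place of the real DeepFilterNet3 and Whisper models (the function shapes here are illustrative assumptions, not the SDK's API):

```python
# "Denoise before transcribe": compose a denoiser in front of an STT stage.
# Any denoise/transcribe callables with these shapes slot in the same way.
from typing import Callable

Audio = bytes  # stand-in for a decoded audio buffer


def transcribe_clean(
    audio: Audio,
    denoise: Callable[[Audio], Audio],
    transcribe: Callable[[Audio], str],
) -> str:
    """Route audio through noise suppression before the STT stage."""
    return transcribe(denoise(audio))


# Stub stages so the pattern runs without the real models.
def fake_denoise(audio: Audio) -> Audio:
    return audio.replace(b"noise+", b"")  # pretend DeepFilterNet3 ran


def fake_whisper(audio: Audio) -> str:
    return audio.decode()  # pretend Whisper ran


text = transcribe_clean(b"noise+hello world", fake_denoise, fake_whisper)
print(text)  # hello world
```

Because the denoiser and the transcriber only meet at the audio buffer, either stage can be swapped without touching the other.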

Diarize, then identify

For meeting recordings, run /diarize to get the speaker count and segment timing, then enroll the participants you care about and run /identify against the same recording. This separates who is talking from who that person is — useful when the participant list is partially known.
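The join between the two results is a simple lookup from anonymous speaker labels to enrolled names. A sketch under assumed response shapes (the real /diarize and /identify payloads may differ):

```python
# "Diarize, then identify": attach enrolled names to anonymous diarization
# labels. Segment and identity shapes below are illustrative assumptions.
def label_segments(segments, identities):
    """Attach an enrolled name to each diarized segment where one is known."""
    out = []
    for seg in segments:
        name = identities.get(seg["speaker"], "unknown")
        out.append({**seg, "name": name})
    return out


# Hypothetical /diarize output: anonymous speaker labels with timing.
segments = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 4.2},
    {"speaker": "SPEAKER_01", "start": 4.2, "end": 9.8},
    {"speaker": "SPEAKER_00", "start": 9.8, "end": 12.1},
]

# Hypothetical /identify output against the same recording: only enrolled
# participants resolve; everyone else stays "unknown".
identities = {"SPEAKER_00": "alice"}

for seg in label_segments(segments, identities):
    print(seg["name"], seg["start"], seg["end"])
```

Unmatched speakers fall through as "unknown", which is exactly the partially-known-participants case the pattern targets.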

One agent, multiple frontends

A single Voice Agent backend serves React web, iOS/macOS, Flutter, React Native, Android, web embeds, and telephony integrations through LiveKit. Build the conversation logic once; ship it everywhere.

OpenAI client, self-hosted endpoints

Because Speaches exposes OpenAI-compatible STT and TTS endpoints, existing applications can migrate from OpenAI to self-hosted by changing only base_url. No client rewrite required.
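Concretely, the only moving part is the base URL: the relative /v1/audio/* paths, request payloads, and response shapes stay the same. With the official openai Python client this is the `base_url` constructor argument; the sketch below shows the URL swap itself, with a localhost address standing in for wherever Speaches is actually deployed:

```python
# base_url migration sketch: the same relative path resolves against either
# backend. The localhost URL is an assumed local deployment, not a fixed
# address; with the openai client, pass it as OpenAI(base_url=...).
from urllib.parse import urljoin

OPENAI = "https://api.openai.com/v1/"
SELF_HOSTED = "http://localhost:8000/v1/"  # assumed Speaches deployment


def endpoint(base_url: str, path: str) -> str:
    """Resolve a relative API path against a backend's base URL."""
    return urljoin(base_url, path)


# Identical path, different backend -- no other client change needed.
print(endpoint(OPENAI, "audio/transcriptions"))
print(endpoint(SELF_HOSTED, "audio/transcriptions"))
```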