Introduction
Voice SDK is a containerized, end-to-end voice platform from Trouve Labs. It lets you build production-grade audio and speech systems (transcription, synthesis, noise suppression, speaker biometrics, and conversational AI) from a single self-hosted stack.
Voice SDK is built and maintained by the AVML (Audio, Voice & Machine Learning) team. A live instance runs at voiceai.trouve.works.
What Voice SDK is
Instead of stitching together fragmented APIs from OpenAI, Google Cloud, Azure, and others, Voice SDK delivers the full voice pipeline as a single, modular SDK. Every module ships as a Docker container that runs locally, in the cloud, or at the edge. Each module exposes its own standalone REST/WebSocket API and is also reachable through a unified web interface.
Core pipeline
Audio Input → Processing → Understanding → Response → Monitoring
Five stages, four modules, one SDK. Use the modules independently, or compose them into a complete agent loop.
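To make "use independently, or compose" concrete, here is a minimal sketch of the five stages as plain Python callables chained into one loop. Only the stage names come from the pipeline above. The function bodies, the `state` dictionary, and `run_pipeline` are hypothetical stand-ins for illustration and are not Voice SDK APIs.

```python
from functools import reduce

# Stage names mirror the core pipeline; every body below is a
# hypothetical placeholder, not a real Voice SDK call.


def audio_input(state):
    # Pretend we captured two bytes of audio.
    return {**state, "audio": b"\x00\x01"}


def processing(state):
    # Placeholder for cleanup work, e.g. noise suppression.
    return {**state, "clean_audio": state["audio"]}


def understanding(state):
    # Placeholder for speech-to-text.
    return {**state, "transcript": "hello"}


def response(state):
    # Placeholder for generating a reply from the transcript.
    return {**state, "reply": f"You said: {state['transcript']}"}


def monitoring(state):
    # Record which keys the earlier stages produced.
    return {**state, "observed_keys": sorted(state)}


PIPELINE = [audio_input, processing, understanding, response, monitoring]


def run_pipeline(stages, state=None):
    """Feed the output of each stage into the next one."""
    return reduce(lambda acc, stage: stage(acc), stages, state or {})


if __name__ == "__main__":
    print(run_pipeline(PIPELINE)["reply"])
```

Because each stage takes and returns a plain dictionary, any single stage can be called on its own, and a shorter list such as `PIPELINE[:3]` runs only part of the loop.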
Why teams adopt it
| Advantage | What it means in practice |
|---|---|
| One SDK | Replaces multiple fragmented vendor APIs with a unified self-hosted stack |
| Container-first | Every component is a Docker image. The same artifact runs on a workstation, a cloud node, or an edge device |
| Self-hosted | No data leaves your infrastructure. Every model (STT, TTS, LLM, embedding, diarization) runs on hardware you control |
| GPU-accelerated | NVIDIA CUDA 12.x throughout, with multi-GPU allocation across services |
| Streaming | Real-time WebSocket / WebRTC paths for every interactive module |
| OpenAI-compatible | Drop-in replacement for the OpenAI audio transcription and speech endpoints |
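The OpenAI-compatible row can be illustrated with a short sketch that builds, but does not send, the two requests involved. The paths `/v1/audio/transcriptions` and `/v1/audio/speech` are the ones OpenAI's public audio API uses. Everything else is an assumption: the base URL is a placeholder for your own deployment, and the model and voice names are OpenAI's defaults, which a given Voice SDK install may or may not accept.

```python
import json
from urllib.parse import urljoin

# Assumption: placeholder address of a self-hosted deployment.
BASE_URL = "http://localhost:8000/"

# Paths used by OpenAI's audio API; assumed to be mirrored by the SDK.
TRANSCRIPTION_PATH = "v1/audio/transcriptions"
SPEECH_PATH = "v1/audio/speech"


def speech_request(text, model="tts-1", voice="alloy", base_url=BASE_URL):
    """Describe a text-to-speech request (JSON body) without sending it."""
    body = {"model": model, "input": text, "voice": voice}
    return {
        "method": "POST",
        "url": urljoin(base_url, SPEECH_PATH),
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def transcription_request(filename, model="whisper-1", base_url=BASE_URL):
    """Describe a speech-to-text request; audio goes up as multipart form data."""
    return {
        "method": "POST",
        "url": urljoin(base_url, TRANSCRIPTION_PATH),
        "form_fields": {"model": model},
        "file_field": ("file", filename),
    }


if __name__ == "__main__":
    print(speech_request("Hello from Voice SDK")["url"])
    print(transcription_request("meeting.wav")["url"])
```

If the endpoints are compatible as described, an existing OpenAI client should only need its base URL pointed at the self-hosted address. Check the deployment's own API reference for the exact host, port, and supported model names.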
Modules
| Module | Purpose | Path |
|---|---|---|
| Voice Agent | End-to-end conversational AI over WebRTC | / |
| Voice Utilities | Transcription and speech synthesis studio | /utilities/ |
| Noise Suppression | DeepFilterNet3 audio cleanup, file and real-time | /noise/ |
| Voice Biometrics | Speaker identification and diarization | /biometric/ |
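The path column above can be turned into full URLs with a small helper. The paths are copied from the table and the host is the live instance named in the introduction. The `https` scheme and the helper name `module_url` are assumptions for illustration, so substitute your own host when self-hosting.

```python
from urllib.parse import urljoin

# Host taken from the introduction; the https scheme is an assumption.
BASE_URL = "https://voiceai.trouve.works"

# Paths copied from the Modules table.
MODULE_PATHS = {
    "Voice Agent": "/",
    "Voice Utilities": "/utilities/",
    "Noise Suppression": "/noise/",
    "Voice Biometrics": "/biometric/",
}


def module_url(module, base_url=BASE_URL):
    """Return the web interface URL for a named module."""
    try:
        path = MODULE_PATHS[module]
    except KeyError:
        raise ValueError(f"unknown module: {module!r}") from None
    return urljoin(base_url, path)


if __name__ == "__main__":
    for name in MODULE_PATHS:
        print(f"{name}: {module_url(name)}")
```

For example, `module_url("Noise Suppression")` returns `https://voiceai.trouve.works/noise/`.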
Where to next
- Prerequisites — what you need installed
- Installation — clone, configure, run the containers
- Quick start — your first transcription, synthesis, and conversation
- Architecture — service topology and URL routing
- Modules — per-module deep dives
- About Trouve Labs — who builds this