Modules | Voice SDK

📄️Voice Agent

A fully self-hosted, end-to-end conversational voice AI system. Users speak naturally; the agent listens, understands, thinks, and responds with voice in real time.

📄️Voice Utilities

Speech-to-text and text-to-speech services, available as APIs and through the VoiceLab web interface ("AI Speech Studio"). It has three components: the VoiceLab frontend, the containerized Speaches backend, and an STT test adapter for arbitrary HuggingFace models.

📄️Noise Suppression

Removes background noise from audio — file-based or real-time over a microphone stream. Powered by DeepFilterNet3, a state-of-the-art deep learning noise suppression model.

📄️Voice Biometrics

Identifies who is speaking in an audio recording: voice recognition for speech, conceptually similar to face recognition for images. Provides both speaker identification (matching voices against an enrolled gallery) and speaker diarization ("who spoke when").
