Voice Agent
A fully self-hosted, end-to-end conversational voice AI system. Users speak naturally; the agent listens, understands, thinks, and responds with voice in real time.
Voice Utilities
Speech-to-text and text-to-speech services, available as APIs and through the VoiceLab web interface ("AI Speech Studio"). Three components: the VoiceLab frontend, the Speaches containerized backend, and the STT test adapter for arbitrary HuggingFace models.
Noise Suppression
Removes background noise from audio, either file-based or in real time over a microphone stream. Powered by DeepFilterNet3, a state-of-the-art deep learning noise suppression model.
Voice Biometrics
Identifies who is speaking in an audio recording: voice recognition for speech, conceptually similar to face recognition for images. Provides both speaker identification (matching voices against an enrolled gallery) and speaker diarization ("who spoke when").
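
To make "matching voices against an enrolled gallery" concrete, here is a minimal sketch of one common way such matching can work. It is not taken from this system. It assumes each voice has already been converted to a fixed-length embedding vector by some speaker-embedding model, and that matching uses cosine similarity with a rejection threshold. The function names, the gallery layout, and the 0.7 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, gallery, threshold=0.7):
    """Return (name, score) for the best-matching enrolled speaker.

    `gallery` maps a speaker name to that speaker's enrolled embedding.
    If no enrolled speaker reaches `threshold`, the name is None, which
    signals an unknown speaker. The threshold value is an illustrative
    assumption, not a tuned setting.
    """
    best_name, best_score = None, -1.0
    for name, enrolled in gallery.items():
        score = cosine_similarity(embedding, enrolled)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy three-dimensional embeddings; real speaker embeddings are much longer.
gallery = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
}
match_name, match_score = identify_speaker([0.9, 0.1, 0.0], gallery)
unknown_name, unknown_score = identify_speaker([0.0, 0.0, 1.0], gallery)
```

In this toy run the first query is closest to "alice" and clears the threshold, while the second query is orthogonal to both enrolled voices and is rejected as unknown. Rejecting low-scoring matches matters in practice because a gallery lookup always has a nearest entry, even for a speaker who was never enrolled.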