Installation

Voice SDK ships as a set of Docker services. Each module has its own compose file and can be brought up independently, or all together behind a single Nginx reverse proxy.

Option A — single service

Run only the module you need. Each one exposes a standalone REST/WebSocket API.

Speaches (STT/TTS)

cd speaches
docker compose -f compose.cuda.yaml up -d # GPU
# docker compose -f compose.yaml up -d # CPU fallback

The service listens on port 8051 and serves the OpenAI-compatible endpoints /v1/audio/transcriptions (STT) and /v1/audio/speech (TTS).
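Once the container is up, a quick way to exercise the TTS endpoint. The model and voice names below are illustrative, not guaranteed to be installed; query GET /v1/models on the service to see what is available.

```shell
# Minimal TTS request against the local Speaches container.
# "tts-1" and "alloy" are placeholder OpenAI-style names; substitute a model
# and voice that your Speaches install actually serves.
curl -s http://localhost:8051/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello from Voice SDK", "voice": "alloy"}' \
  -o hello.mp3 || echo "no service on :8051 yet"
```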

Noise Suppression

cd NoiseSuppression/backend
cp .env.example .env # edit GPU_DEVICE
pip install -r requirements.txt
python main.py # runs on :8060
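The copied .env needs at least the GPU device set before the backend starts. GPU_DEVICE is the variable named in .env.example above; the value shown matches the cuda:2 device reported by the health endpoint, but adjust it to a free GPU on your host.

```shell
# Minimal .env for the noise-suppression backend.
# GPU_DEVICE comes from .env.example; cuda:2 is an example value.
GPU_DEVICE=cuda:2
```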

cd ../frontend
npm install && npm run dev # runs on :8061

Voice Biometrics

cd VoiceBiometrics
docker compose up -d # PostgreSQL + API on :8066

The compose file boots PostgreSQL (:8065) and the FastAPI service (:8066) with GPU 4 reserved.
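The GPU pin is done with a Compose device reservation. A sketch of how such a reservation typically looks; the compose file in the repo is authoritative:

```yaml
# Illustrative Compose fragment: reserve GPU 4 for the API service.
services:
  api:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["4"]
              capabilities: [gpu]
```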

Voice Agent

# Backend agent
cd agent/agent_scratch
pip install -r requirements.txt
python agent.py dev # connects to LiveKit

# Frontend
cd conversationalai
npm install && npm run dev
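Before `python agent.py dev` can connect, the agent needs LiveKit credentials in its environment. The variable names below are the standard ones read by the LiveKit Agents framework; the values are placeholders for your own LiveKit server or Cloud project:

```shell
# Standard LiveKit credential variables; values are placeholders.
export LIVEKIT_URL="wss://your-livekit-host"
export LIVEKIT_API_KEY="your-api-key"
export LIVEKIT_API_SECRET="your-api-secret"
```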

Option B — full platform

Bring up every module behind one Nginx reverse proxy at voiceai.trouve.works. The proxy dispatches by path:

| Path | Service |
| --- | --- |
| `/` | Voice Agent (Next.js frontend) |
| `/livekit` | LiveKit Server (WebRTC) — port 7880 |
| `/services/` | Speaches (STT/TTS) — port 8051 |
| `/utilities/` | VoiceUtilities (static frontend) — port 8112 |
| `/noise/` | Noise Suppression frontend — port 8061 |
| `/noise/api/` | Noise Suppression backend — port 8060 |
| `/noise/ws/` | Noise Suppression WebSocket — port 8060 |
| `/biometric/` | Voice Biometrics frontend |
| `/biometric/api/` | Voice Biometrics backend — port 8066 |
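In Nginx terms, the path dispatch looks roughly like the fragment below, shown for the noise-suppression routes as an example. This is illustrative only; the deployed proxy config (see Deployment) is the source of truth.

```nginx
# Illustrative: strip the /noise/api/ prefix and forward to the backend.
location /noise/api/ {
    proxy_pass http://127.0.0.1:8060/;
}

location /noise/ws/ {
    proxy_pass http://127.0.0.1:8060/;
    # WebSocket upgrade headers
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
```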

See Deployment for the full topology, GPU allocation, and reverse-proxy configuration.

Verifying the install

# STT/TTS health
curl https://voiceai.trouve.works/services/health

# Noise suppression health
curl https://voiceai.trouve.works/noise/api/health
# → {"status":"ok","deepfilter_device":"cuda:2"}

# Biometrics health
curl https://voiceai.trouve.works/biometric/api/health

If all three return 200, the platform is up.
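The three checks above can be run in one pass. BASE assumes the Option B deployment host; point it at localhost ports for single-service installs.

```shell
# Report the HTTP status of each health endpoint.
BASE="https://voiceai.trouve.works"
for path in /services/health /noise/api/health /biometric/api/health; do
  code=$(curl -s -o /dev/null -w '%{http_code}' "$BASE$path" || true)
  printf '%s -> %s\n' "$path" "$code"
done
```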

What lives where

| Asset | Location |
| --- | --- |
| Models cache (HuggingFace + Torch) | `/storage/models/` |
| Speaker enrollments | `/storage/enrollments/{speaker_id}/` |
| Job artifacts (uploads + extracted segments) | `/storage/jobs/{job_id}/` |
| PostgreSQL data volume | `pgdata` (Docker volume) |

On a first-time install, expect 50–200 GB of model downloads in total; each service fetches its models on the first request it receives.
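The cache location can be pinned with the standard HuggingFace and Torch cache environment variables. The subdirectories below are an assumption consistent with the table above; check that your compose files mount and pass these paths through.

```shell
# Standard cache variables read by huggingface_hub and torch.hub.
# The exact subdirectories under /storage/models/ are illustrative.
export HF_HOME=/storage/models/huggingface
export TORCH_HOME=/storage/models/torch
```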