Installation

Voice SDK ships as a set of Docker services. Each module has its own compose file and can be brought up independently, or all together behind a single Nginx reverse proxy.

Option A — single service

Run only the module you need. Each one exposes a standalone REST/WebSocket API.

Speaches (STT/TTS)

cd speaches
docker compose -f compose.cuda.yaml up -d # GPU
# docker compose -f compose.yaml up -d # CPU fallback

The service listens on port 8051 and serves the OpenAI-compatible endpoints /v1/audio/transcriptions (STT) and /v1/audio/speech (TTS).
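Once the container is up, a quick way to exercise the TTS endpoint. The model and voice names below are illustrative, not guaranteed to be installed; query GET /v1/models on the service to see what is available.

```shell
# Minimal TTS request against the local Speaches container.
# "tts-1" and "alloy" are placeholder OpenAI-style names; substitute a model
# and voice that your Speaches install actually serves.
curl -s http://localhost:8051/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello from Voice SDK", "voice": "alloy"}' \
  -o hello.mp3 || echo "no service on :8051 yet"
```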

Noise Suppression

cd NoiseSuppression/backend
cp .env.example .env # edit GPU_DEVICE
pip install -r requirements.txt
python main.py # runs on :8060
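The copied .env needs at least the GPU device set before the backend starts. GPU_DEVICE is the variable named in .env.example above; the value shown matches the cuda:2 device reported by the health endpoint, but adjust it to a free GPU on your host.

```shell
# Minimal .env for the noise-suppression backend.
# GPU_DEVICE comes from .env.example; cuda:2 is an example value.
GPU_DEVICE=cuda:2
```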

cd ../frontend
npm install && npm run dev # runs on :8061

Voice Biometrics

cd VoiceBiometrics
docker compose up -d # PostgreSQL + API on :8066

The compose file boots PostgreSQL (:8065) and the FastAPI service (:8066) with GPU 4 reserved.
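The GPU pin is done with a Compose device reservation. A sketch of how such a reservation typically looks; the compose file in the repo is authoritative:

```yaml
# Illustrative Compose fragment: reserve GPU 4 for the API service.
services:
  api:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["4"]
              capabilities: [gpu]
```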

Voice Agent

# Backend agent
cd agent/agent_scratch
pip install -r requirements.txt
python agent.py dev # connects to LiveKit

# Frontend
cd conversationalai
npm install && npm run dev
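Before `python agent.py dev` can connect, the agent needs LiveKit credentials in its environment. The variable names below are the standard ones read by the LiveKit Agents framework; the values are placeholders for your own LiveKit server or Cloud project:

```shell
# Standard LiveKit credential variables; values are placeholders.
export LIVEKIT_URL="wss://your-livekit-host"
export LIVEKIT_API_KEY="your-api-key"
export LIVEKIT_API_SECRET="your-api-secret"
```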

Option B — full platform

Bring up every module behind one Nginx reverse proxy at voiceai.trouve.works. The proxy dispatches by path:

| Path | Service |
| --- | --- |
| `/` | Voice Agent (Next.js frontend) |
| `/livekit` | LiveKit Server (WebRTC) — port 7880 |
| `/services/` | Speaches (STT/TTS) — port 8051 |
| `/utilities/` | VoiceUtilities (static frontend) — port 8112 |
| `/noise/` | Noise Suppression frontend — port 8061 |
| `/noise/api/` | Noise Suppression backend — port 8060 |
| `/noise/ws/` | Noise Suppression WebSocket — port 8060 |
| `/biometric/` | Voice Biometrics frontend |
| `/biometric/api/` | Voice Biometrics backend — port 8066 |
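In Nginx terms, the path dispatch looks roughly like the fragment below, shown for the noise-suppression routes as an example. This is illustrative only; the deployed proxy config (see Deployment) is the source of truth.

```nginx
# Illustrative: strip the /noise/api/ prefix and forward to the backend.
location /noise/api/ {
    proxy_pass http://127.0.0.1:8060/;
}

location /noise/ws/ {
    proxy_pass http://127.0.0.1:8060/;
    # WebSocket upgrade headers
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
```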

See Deployment for the full topology, GPU allocation, and reverse-proxy configuration.

Verifying the install

# STT/TTS health
curl https://voiceai.trouve.works/services/health

# Noise suppression health
curl https://voiceai.trouve.works/noise/api/health
# → {"status":"ok","deepfilter_device":"cuda:2"}

# Biometrics health
curl https://voiceai.trouve.works/biometric/api/health

If all three return 200, the platform is up.
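The three checks above can be run in one pass. BASE assumes the Option B deployment host; point it at localhost ports for single-service installs.

```shell
# Report the HTTP status of each health endpoint.
BASE="https://voiceai.trouve.works"
for path in /services/health /noise/api/health /biometric/api/health; do
  code=$(curl -s -o /dev/null -w '%{http_code}' "$BASE$path" || true)
  printf '%s -> %s\n' "$path" "$code"
done
```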

What lives where

| Asset | Location |
| --- | --- |
| Models cache (HuggingFace + Torch) | `/storage/models/` |
| Speaker enrollments | `/storage/enrollments/{speaker_id}/` |
| Job artifacts (uploads + extracted segments) | `/storage/jobs/{job_id}/` |
| PostgreSQL data volume | `pgdata` (Docker volume) |

On a first-time install, expect 50–200 GB of model downloads in total; each service fetches its models on the first request it receives.
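The cache location can be pinned with the standard HuggingFace and Torch cache environment variables. The subdirectories below are an assumption consistent with the table above; check that your compose files mount and pass these paths through.

```shell
# Standard cache variables read by huggingface_hub and torch.hub.
# The exact subdirectories under /storage/models/ are illustrative.
export HF_HOME=/storage/models/huggingface
export TORCH_HOME=/storage/models/torch
```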