Voice
The Voice API lives alongside Agent Core under /v1/agents/{agent_id}/voice/*. It unifies voice library listing, prompt-based voice design, audio sample cloning, TTS preview, and voice deletion.
List voices
GET /v1/agents/{agent_id}/voice/library

Query params:
source (optional): filter by origin. One of public, user_designed, user_trained, or user_elevenlabs.
Response 200:
{
  "voices": [
    { "voice_id": "21m00...", "name": "Rachel", "source": "public", "preview_url": null },
    { "voice_id": "huv-1", "name": "Custom", "source": "user_designed", "preview_url": null },
    { "voice_id": "huv-2", "name": "Santiago", "source": "user_trained", "preview_url": null },
    { "voice_id": "el-abc", "name": "My Voice", "source": "user_elevenlabs", "preview_url": null }
  ],
  "total": 4
}

The user_elevenlabs source only appears when the agent has a connected ElevenLabs API key (see ElevenLabs integration below).
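The request URL and the response shape above can be sketched as two small helpers. This is a minimal illustration, not a client library: the helper names `library_url` and `names_by_source` are hypothetical, and the base URL passed in is a placeholder you would replace with your own host.

```python
from urllib.parse import quote, urlencode

# The four origins listed for the `source` query parameter.
VALID_SOURCES = {"public", "user_designed", "user_trained", "user_elevenlabs"}


def library_url(base_url, agent_id, source=None):
    """Build the GET URL for the voice library, optionally filtered by source."""
    agent = quote(str(agent_id), safe="")
    url = f"{base_url.rstrip('/')}/v1/agents/{agent}/voice/library"
    if source is None:
        return url
    if source not in VALID_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return f"{url}?{urlencode({'source': source})}"


def names_by_source(library_response):
    """Group voice names from a library response by their `source` field."""
    grouped = {}
    for voice in library_response.get("voices", []):
        grouped.setdefault(voice["source"], []).append(voice["name"])
    return grouped
```

Validating `source` on the client is a design choice of this sketch; the docs do not say how the server treats an unknown value.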
ElevenLabs integration
Users connect their own ElevenLabs API key to unlock their personal voice library and draw on their own usage quota. The key is validated against the ElevenLabs /v1/user endpoint before it is saved on the agent record.
Get status
GET /v1/agents/{agent_id}/voice/integrations/elevenlabs

Response 200:
{ "connected": true, "masked_key": "***xyz1" }

Connect (validate + save)
POST /v1/agents/{agent_id}/voice/integrations/elevenlabs

Body: {"api_key": "sk_..."}
Response 200: {"connected": true, "user": { "subscription": { "tier": "creator" } }} on success. 400 with invalid_key if ElevenLabs rejects the key.
Test a key without saving
POST /v1/agents/{agent_id}/voice/integrations/elevenlabs/test

Body: {"api_key": "sk_..."}
Response 200: {"valid": true | false}
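The test and connect endpoints can be chained so a key is only saved after it passes the dry-run check. The sketch below assumes a hypothetical `post(path, body)` callable that performs the HTTP request and returns `(status_code, parsed_json)`; it is injected so that no particular HTTP library is implied. Because connect validates the key on its own, the pre-check is optional.

```python
def connect_elevenlabs(post, agent_id, api_key):
    """Check a key with the /test endpoint, then save it if it is valid.

    `post(path, body)` is a hypothetical transport callable returning
    (status_code, parsed_json).
    """
    base = f"/v1/agents/{agent_id}/voice/integrations/elevenlabs"

    # Dry run: nothing is stored on the agent record.
    status, body = post(f"{base}/test", {"api_key": api_key})
    if status != 200 or not body.get("valid"):
        return {"connected": False, "tier": None, "reason": "key failed validation"}

    # Validate and save.
    status, body = post(base, {"api_key": api_key})
    if status == 400:
        # The docs describe a 400 with invalid_key when ElevenLabs rejects the key.
        return {"connected": False, "tier": None, "reason": "invalid_key"}
    if status != 200:
        return {"connected": False, "tier": None, "reason": f"unexpected status {status}"}

    tier = body.get("user", {}).get("subscription", {}).get("tier")
    return {"connected": bool(body.get("connected")), "tier": tier, "reason": None}
```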
Disconnect
DELETE /v1/agents/{agent_id}/voice/integrations/elevenlabs

Returns 204 No Content.
Voice design — 4-step prompt flow
Build a voice from a text description. Steps 1–3 are cheap (LLM calls only); step 4 persists the chosen voice.
1. Generate script
POST /v1/agents/{agent_id}/voice/design/generate-script

Body: {"language": "en"} (optional)
Response 200: {"voice_sample": "Hi there, I'm your voice, and I'm going to tell you about..."}
2. Generate style description
POST /v1/agents/{agent_id}/voice/design/generate-description

Body: {"gender": "male", "voice_description": "warm and confident", "language": "en", "script": "..."}
gender is required.
voice_description, language, and script are optional hints.
Response 200: {"voice_description": "A warm, confident male voice with a measured pace..."}
3. Generate candidate samples
POST /v1/agents/{agent_id}/voice/design/generate-samples

Body: {"text": "...", "description": "..."}
text is required (typically the script from step 1).
Response 200: {"samples": [{"generation_id": "g1", "audio": "base64..."}, {"generation_id": "g2", ...}, {"generation_id": "g3", ...}]}
Pick one of the three generation_id values and pass it to step 4. Treat it as opaque.
4. Finalize
POST /v1/agents/{agent_id}/voice/design

Body:
{
  "generation_id": "g2",
  "name": "Santiago",
  "gender": "male",
  "language": "en",
  "description": "warm, confident"
}

Response 201: The created voice resource. It is now visible in the library under source: user_designed.
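The four design steps can be chained as in the sketch below. It assumes a hypothetical `post(path, body)` callable that sends the request and returns the parsed JSON response; `design_voice`, `hint`, and `pick` are illustrative names, not part of the API. In practice a person would normally listen to the three samples before choosing, so the `pick` index stands in for that decision.

```python
def design_voice(post, agent_id, name, gender, hint="", language="en", pick=0):
    """Run the four design steps in order and return the created voice.

    `post(path, body)` is a hypothetical callable returning parsed JSON;
    `pick` selects which candidate sample to finalize.
    """
    base = f"/v1/agents/{agent_id}/voice/design"

    # Step 1: a script for the voice to read.
    script = post(f"{base}/generate-script", {"language": language})["voice_sample"]

    # Step 2: expand the short hint into a full style description.
    description = post(
        f"{base}/generate-description",
        {"gender": gender, "voice_description": hint, "language": language, "script": script},
    )["voice_description"]

    # Step 3: candidate samples, each carrying an opaque generation_id.
    samples = post(
        f"{base}/generate-samples",
        {"text": script, "description": description},
    )["samples"]

    # Step 4: persist the chosen candidate.
    return post(
        base,
        {
            "generation_id": samples[pick]["generation_id"],
            "name": name,
            "gender": gender,
            "language": language,
            "description": description,
        },
    )
```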
Voice cloning (synchronous)
Clone a voice from an existing audio sample. The call returns the created voice immediately, so no polling is needed.

POST /v1/agents/{agent_id}/voice/clone

Body: Provide exactly one of audio_url or audio_base64.
{
  "name": "Santiago Clone",
  "audio_url": "https://cdn.example.com/sample.mp3",
  "gender": "male",
  "language": "en",
  "description": "A warm, natural speaking voice"
}

Or with inline bytes for small clips:
{
  "name": "Santiago Clone",
  "audio_base64": "SUQzBAAAAAAA...",
  "gender": "male"
}

audio_url is preferred for anything longer than a few seconds: the server downloads the file directly, which is faster and avoids base64 overhead.
Response 201: The created voice resource. Visible in the library under source: user_trained.
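The "exactly one of audio_url or audio_base64" rule is easy to enforce before sending the request. The helper below is a sketch under two assumptions: the name `clone_body` is hypothetical, and the server is assumed to accept standard (RFC 4648) base64, which the docs do not state explicitly.

```python
import base64


def clone_body(name, audio_url=None, audio_bytes=None, **optional):
    """Build the JSON body for the clone endpoint.

    Exactly one of `audio_url` or `audio_bytes` must be given; raw bytes are
    base64-encoded into `audio_base64`. Extra keyword arguments (for example
    gender, language, description) are passed through unchanged.
    """
    if not name:
        raise ValueError("name is required")
    if (audio_url is None) == (audio_bytes is None):
        raise ValueError("provide exactly one of audio_url or audio_bytes")

    body = {"name": name, **optional}
    if audio_url is not None:
        body["audio_url"] = audio_url
    else:
        body["audio_base64"] = base64.b64encode(audio_bytes).decode("ascii")
    return body
```

Checking this locally avoids a round trip that would otherwise end in a 400.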
Delete a user-owned voice
DELETE /v1/agents/{agent_id}/voice/{voice_id}

Returns 204 No Content. Only works on voices the agent owns (user_designed, user_trained, user_elevenlabs). Public voices cannot be deleted.
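Since only user-owned voices can be deleted, a client can filter a library response before offering a delete action. This is a small sketch; `deletable_voice_ids` is a hypothetical helper name, and the input is assumed to have the shape of the library listing response.

```python
# Sources the docs list as owned by the agent and therefore deletable.
USER_OWNED_SOURCES = {"user_designed", "user_trained", "user_elevenlabs"}


def deletable_voice_ids(library_response):
    """Return the voice_id of every voice the delete endpoint should accept."""
    return [
        voice["voice_id"]
        for voice in library_response.get("voices", [])
        if voice.get("source") in USER_OWNED_SOURCES
    ]
```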
TTS preview (asynchronous)
Audition a voice saying specific text before assigning it via PATCH /core.
Start a preview
POST /v1/agents/{agent_id}/voice/{voice_id}/preview

Body: {"text": "Hello world"}
Response 202 Accepted: {"user_job_id": 138860}
This enqueues a UserJob on the GEN backend. The job charges credits and writes output audio as a ContentResource when complete.
Poll for completion
GET /v1/agents/{agent_id}/voice/preview/{job_id}

Response 200: The full user_job record. Check the status field:
{
  "id": 138860,
  "status": "completed",
  "output_resources": [
    { "id": 99, "type": "audio", "file_url": "https://cdn.gen.pro/.../preview.mp3" }
  ]
}

Possible statuses: pending, processing, completed, failed.
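A polling loop over the four statuses might look like the sketch below. It assumes a hypothetical `get_job(job_id)` callable that performs the GET and returns the parsed user_job record. The interval and timeout defaults are arbitrary illustrative values, since the docs do not state how long a preview takes.

```python
import time


def wait_for_preview(get_job, job_id, interval=2.0, timeout=120.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll a preview job until it completes, fails, or the timeout elapses.

    `get_job(job_id)` is a hypothetical callable returning the parsed
    user_job record. Returns the file_url of the first audio resource.
    """
    deadline = clock() + timeout
    while True:
        job = get_job(job_id)
        status = job.get("status")
        if status == "completed":
            for resource in job.get("output_resources", []):
                if resource.get("type") == "audio":
                    return resource["file_url"]
            raise RuntimeError("job completed without an audio resource")
        if status == "failed":
            raise RuntimeError(f"preview job {job_id} failed")
        # pending or processing: wait and try again unless time has run out.
        if clock() >= deadline:
            raise TimeoutError(f"preview job {job_id} still {status!r} after {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.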
Errors
| Code | Meaning |
|---|---|
| 400 | Missing required field (e.g. text on preview, name on clone, gender on generate-description) |
| 400 | invalid_key on ElevenLabs connect — the key was rejected by ElevenLabs |
| 400 | Both audio_url and audio_base64 provided on clone (supply exactly one) |
| 401 | Missing or invalid X-API-Key |
| 404 | Voice not found (on delete or preview) |
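The table above can be turned into a small status check on the client. This is only a sketch: `VoiceApiError` and `raise_for_status` are hypothetical names, and because the docs do not describe the shape of error response bodies, the function looks at the status code alone.

```python
class VoiceApiError(Exception):
    """Raised for non-2xx responses; carries the HTTP status code."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


# Meanings taken from the error table; anything else is reported generically.
ERROR_MEANINGS = {
    400: "bad request (missing field, invalid_key, or both audio inputs supplied)",
    401: "missing or invalid X-API-Key",
    404: "voice not found",
}


def raise_for_status(status):
    """Raise VoiceApiError for any status outside the 2xx range."""
    if 200 <= status < 300:
        return
    raise VoiceApiError(status, ERROR_MEANINGS.get(status, "unexpected error"))
```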
See the Agent Core reference for identity, overview, personality, and other non-voice sections.