POST /v1/audio/speech

Saved Voices API

Generate speech using voices you've saved from VoiceDesign. Reusing a saved voice keeps the voice consistent across all your content.

Creating Saved Voices

First, create a voice with VoiceDesign, then save it using POST /api/v1/voices. Use the returned voice ID (e.g., voice_abc123) with this endpoint.

Endpoints

POST /v1/audio/speech

Batch endpoint: returns 202 with a job ID; poll for the audio

POST /v1/audio/speech/stream

Streaming endpoint: low-latency SSE chunks

Different response models

Batch (/v1/audio/speech): Always returns 202 Accepted with a job ID. Poll GET /v1/jobs/{jobId} to retrieve the audio. Supports response_format and webhook_url.
Streaming (/v1/audio/speech/stream): Returns SSE with base64 PCM chunks for real-time playback.

Building a real-time app?

The batch endpoint is optimized for bulk generation and file exports, not low-latency playback. If you need audio as fast as possible (voice agents, live previews, interactive apps), use /v1/audio/speech/stream instead — it delivers first audio in ~450ms.

Request Parameters

Send a JSON body with the following parameters:

Parameter	Type	Default	Description
`text` (required)	string	—	The text to synthesize. Maximum 4,096 characters.
`voice`	string	—	Saved voice ID (e.g., "voice_abc123"). The API resolves this to the stored voice embeddings automatically. Either voice or voice_clone_prompt is required.
`voice_clone_prompt`	string	—	Base64-encoded voice embedding data. Optional — only needed if you store and manage embeddings yourself. Most users should use voice instead.
`language`	string	Auto	Output language: English, Spanish, Portuguese, German, French, Italian, Chinese, Japanese, Korean, Russian, or "Auto"
`response_format`	string	mp3	Audio format: mp3 (default), opus, aac, flac, wav, or pcm. See Audio Formats.
`webhook_url`	string	—	HTTPS URL for async delivery (batch endpoint only). When provided, completed audio is POSTed to this URL. See Async Jobs.
`input`	string	—	Alias for text (OpenAI API compatibility). If both are provided, text takes precedence.
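
The constraints in the table can be checked on the client before a request is sent. The following is a minimal Python sketch for the batch endpoint, not part of any official SDK: `build_speech_request` and the constant names are hypothetical, and the checks only mirror the limits listed above.

```python
from urllib.parse import urlparse

MAX_TEXT_CHARS = 4096  # limit stated in the parameter table
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_request(text, voice=None, voice_clone_prompt=None,
                         language="Auto", response_format="mp3",
                         webhook_url=None):
    """Return a JSON-serializable request body, or raise ValueError."""
    if not text:
        raise ValueError("text is required")
    # len() counts Unicode code points; how the API itself counts
    # characters is an assumption here.
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if not (voice or voice_clone_prompt):
        raise ValueError("either voice or voice_clone_prompt is required")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    if webhook_url is not None and urlparse(webhook_url).scheme != "https":
        raise ValueError("webhook_url must be an HTTPS URL")

    body = {"text": text, "language": language,
            "response_format": response_format}
    # Only include optional fields that were actually provided.
    if voice:
        body["voice"] = voice
    if voice_clone_prompt:
        body["voice_clone_prompt"] = voice_clone_prompt
    if webhook_url:
        body["webhook_url"] = webhook_url
    return body
```

Validating locally avoids spending a round trip on a request that would be rejected with a 400; the server remains the authority on what is accepted.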

Format your text for better prosody

Newlines in your input text act as pause cues — \n creates a sentence-level breath pause, \n\n creates a paragraph-level pause. See the Text Formatting Guide for best practices on natural-sounding output.
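
As an illustration of the pause cues described above, here is a small Python sketch that assembles input text from sentences and paragraphs. The helper name `format_for_prosody` is hypothetical, and putting every sentence on its own line is an assumption made for the example rather than a recommendation from the Text Formatting Guide.

```python
def format_for_prosody(paragraphs):
    """Build input text with newline pause cues.

    `paragraphs` is a list of paragraphs, each a list of sentence strings.
    Sentences are joined with a single newline (sentence-level pause) and
    paragraphs with a blank line (paragraph-level pause).
    """
    return "\n\n".join(
        "\n".join(s.strip() for s in sentences if s.strip())
        for sentences in paragraphs
        if any(s.strip() for s in sentences)  # skip empty paragraphs
    )


text = format_for_prosody([
    ["Welcome back.", "Today we cover two topics."],
    ["First, the news."],
])
# text == "Welcome back.\nToday we cover two topics.\n\nFirst, the news."
```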

Batch Example

The batch endpoint returns a 202 Accepted response with a job ID. Poll for the audio result.

# Submit batch job
curl -X POST "https://api.murmr.dev/v1/audio/speech" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome back to episode 2 of our podcast.",
    "voice": "voice_abc123",
    "response_format": "mp3"
  }'
# Returns: {"id":"job_a1b2c3d4e5f6g7h8","status":"queued","created_at":"..."}

# Poll for result (returns binary audio when complete)
curl "https://api.murmr.dev/v1/jobs/job_a1b2c3d4e5f6g7h8" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output episode2.mp3
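
The submit-then-poll flow above can be wrapped in a loop. The Python sketch below is illustrative only: `poll_job` is a hypothetical helper, the HTTP call is injected as a `fetch` function so the loop stays independent of any client library, and it assumes a finished job is served with an `audio/*` content type while an unfinished one is not. Failed jobs are not handled, since their response shape is not described here.

```python
import time


def poll_job(fetch, job_id, interval_s=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until fetch(job_id) yields audio, then return the audio bytes.

    `fetch` is a caller-supplied function that performs
    GET /v1/jobs/{jobId} and returns (content_type, body_bytes).
    """
    for attempt in range(max_attempts):
        content_type, body = fetch(job_id)
        # Assumption: completed jobs respond with an audio/* content type.
        if content_type.startswith("audio/"):
            return body
        # Wait before the next poll, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job {job_id} not complete after {max_attempts} polls")
```

For long-running jobs, `webhook_url` avoids polling entirely.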

Streaming Example

For low-latency playback, use the streaming endpoint. Audio arrives as Server-Sent Events with base64 PCM chunks.

curl -X POST "https://api.murmr.dev/v1/audio/speech/stream" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is streaming audio.",
    "voice": "voice_abc123"
  }'

See SSE Streaming for a complete integration guide with Web Audio API playback.
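
To show what consuming the stream might look like outside the browser, here is a Python sketch that extracts PCM bytes from SSE lines. The event payload shape is an assumption: the sketch supposes each event's `data` field is a JSON object carrying the base64 chunk under a key named `audio`, which is hypothetical; check the SSE Streaming guide for the actual field names. `iter_pcm_chunks` is likewise a made-up helper name.

```python
import base64
import json


def iter_pcm_chunks(lines, audio_key="audio"):
    """Yield raw PCM bytes from an iterable of decoded SSE text lines.

    Assumption: each event's data is a JSON object holding the base64
    chunk under `audio_key`; the real field name may differ.
    """
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # In the SSE format, one optional space after the colon is dropped.
            value = line[5:]
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            # A blank line terminates the current event.
            try:
                payload = json.loads("\n".join(data_lines))
            except json.JSONDecodeError:
                payload = None  # ignore non-JSON events
            data_lines = []
            if isinstance(payload, dict) and payload.get(audio_key):
                yield base64.b64decode(payload[audio_key])
        # Other fields (event:, id:) and comment lines are ignored.
```

The generator accepts any iterable of text lines, so it can be fed from a streaming HTTP response or from a recorded capture when testing playback code.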

OpenAI Compatibility

This endpoint is compatible with OpenAI's /v1/audio/speech API. The input parameter is accepted as an alias for text. See the OpenAI Migration Guide for details.
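
The alias rule can be stated in a few lines of Python. This is a sketch of the documented precedence (text wins over input), not the server's actual implementation; `resolve_text` is a hypothetical name, and treating an empty string as absent is an assumption.

```python
def resolve_text(body):
    """Return the text to synthesize from a request body dict.

    Mirrors the documented rule: `input` is an alias for `text`, and
    `text` takes precedence when both are present.
    """
    if body.get("text"):
        return body["text"]
    if body.get("input"):
        return body["input"]
    raise ValueError("request needs either text or input")


# An OpenAI-style body that only sets `input` resolves to that value.
openai_style = {"input": "Hello from a migrated client.", "voice": "voice_abc123"}
resolved = resolve_text(openai_style)
```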

Error Responses

Example error response (JSON):

{
  "error": "Saved voice not found: voice_abc123"
}

Status	Error	Description
400	Bad Request	Missing text, text too long (>4096 chars), invalid response_format
401	Unauthorized	Missing or invalid API key
404	Not Found	Saved voice not found or doesn't belong to your account
429	Rate Limit Exceeded	Monthly character quota exceeded, or server at capacity

See Error Reference for complete error handling guidance.
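
One way a client might act on these status codes is sketched below in Python. The function names and the action labels are hypothetical; only the status codes and the `error` field come from the responses documented above. Retrying a 429 helps when the server is at capacity but not when the monthly quota is exhausted, so the error message should be inspected before retrying.

```python
import json


def classify_error(status_code):
    """Map a documented HTTP status to a coarse client action."""
    if status_code == 400:
        return "fix_request"    # missing or too-long text, bad response_format
    if status_code == 401:
        return "check_api_key"
    if status_code == 404:
        return "check_voice_id"
    if status_code == 429:
        return "retry_later"    # quota exceeded or server at capacity
    return "unknown"


def error_message(body_bytes):
    """Extract the `error` string from a JSON error body, if present."""
    try:
        payload = json.loads(body_bytes)
    except (ValueError, TypeError):
        return None
    return payload.get("error") if isinstance(payload, dict) else None


def backoff_delays(base_s=1.0, factor=2.0, attempts=4):
    """Exponential backoff schedule (seconds) for 'retry_later' responses."""
    return [base_s * factor ** i for i in range(attempts)]
```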