Audio Formats
murmr supports six audio output formats via the response_format parameter. All formats are available on all plans.
Supported Formats
Set response_format in your request body. Defaults to mp3 (via Worker; wav on pod direct).
| Format | Codec | Bitrate | Content-Type | Best For |
|---|---|---|---|---|
| wav | — | Lossless | audio/wav | Highest quality, editing |
| pcm | — | Raw | audio/pcm | Direct playback, real-time pipelines |
| mp3 | libmp3lame | 128k | audio/mpeg | Default, downloads, broadest compatibility |
| opus | libopus | 64k | audio/opus | Low bandwidth, WebRTC, voice agents |
| aac | aac | 64k | audio/aac | iOS/Safari, mobile apps |
| flac | flac | Lossless | audio/flac | Archival, lossless compression |
Batch Only
- Batch endpoint (
/v1/audio/speech) supports all 6 formats via theresponse_formatparameter. Format conversion uses ffmpeg server-side. - Streaming endpoints (
/stream) always return raw PCM chunks (base64 encoded in SSE JSON), regardless ofresponse_format.
Requesting a Format
curl -X POST "https://api.murmr.dev/v1/audio/speech" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Hello, world!",
"voice": "voice_abc123",
"response_format": "mp3"
}' --output output.mp3
# Returns 200 with binary audio (mp3 default)Native Audio Specifications
All audio is generated at these native specifications before encoding:
| Property | Value |
|---|---|
| Sample Rate | 24,000 Hz |
| Bit Depth | 16-bit signed |
| Channels | Mono (1) |
| Encoding | PCM (pcm_s16le) |
| Byte Order | Little-endian |
PCM Format Details
Raw PCM audio is signed 16-bit integers in little-endian byte order:
Each sample: 2 bytes (16 bits)
- Range: -32768 to +32767
- Little-endian: low byte first, high byte second
Example (silence): 0x00 0x00
Example (max positive): 0xFF 0x7F = +32767
Example (max negative): 0x00 0x80 = -32768
Duration calculation:
duration_seconds = num_bytes / (sample_rate * 2)
duration_seconds = num_bytes / 48000Base64 Decoding
Streaming endpoints return PCM data as base64-encoded strings in JSON. Decode to raw bytes:
// Decode base64 to PCM Int16Array
function decodeBase64PCM(base64: string): Int16Array {
const binaryString = atob(base64);
const bytes = new Uint8Array(binaryString.length);
for (let i = 0; i < binaryString.length; i++) {
bytes[i] = binaryString.charCodeAt(i);
}
// Convert to Int16Array (2 bytes per sample)
return new Int16Array(bytes.buffer);
}
// Usage with SSE chunk
const data = JSON.parse(event.data.slice(6)); // Remove "data: "
const pcmSamples = decodeBase64PCM(data.chunk);
// Convert to Float32 for Web Audio API (-1.0 to +1.0)
const float32 = new Float32Array(pcmSamples.length);
for (let i = 0; i < pcmSamples.length; i++) {
float32[i] = pcmSamples[i] / 32768;
}Converting PCM to WAV
To save streaming audio as a WAV file, prepend a 44-byte header to the raw PCM data:
function pcmToWav(pcmData: ArrayBuffer, sampleRate = 24000): Blob {
const numChannels = 1;
const bitsPerSample = 16;
const byteRate = sampleRate * numChannels * bitsPerSample / 8;
const blockAlign = numChannels * bitsPerSample / 8;
const dataSize = pcmData.byteLength;
// Create 44-byte WAV header
const header = new ArrayBuffer(44);
const view = new DataView(header);
// "RIFF" chunk descriptor
writeString(view, 0, 'RIFF');
view.setUint32(4, 36 + dataSize, true); // File size - 8
writeString(view, 8, 'WAVE');
// "fmt " sub-chunk
writeString(view, 12, 'fmt ');
view.setUint32(16, 16, true); // Subchunk1 size (16 for PCM)
view.setUint16(20, 1, true); // Audio format (1 = PCM)
view.setUint16(22, numChannels, true);
view.setUint32(24, sampleRate, true);
view.setUint32(28, byteRate, true);
view.setUint16(32, blockAlign, true);
view.setUint16(34, bitsPerSample, true);
// "data" sub-chunk
writeString(view, 36, 'data');
view.setUint32(40, dataSize, true);
return new Blob([header, pcmData], { type: 'audio/wav' });
}
function writeString(view: DataView, offset: number, str: string) {
for (let i = 0; i < str.length; i++) {
view.setUint8(offset + i, str.charCodeAt(i));
}
}
// Usage: Concatenate all PCM chunks, then convert
const allPcmData = concatenateArrayBuffers(pcmChunks);
const wavBlob = pcmToWav(allPcmData, 24000);
// Download
const url = URL.createObjectURL(wavBlob);
const a = document.createElement('a');
a.href = url;
a.download = 'audio.wav';
a.click();Web Audio API Playback
For real-time playback in the browser, schedule PCM chunks with the Web Audio API:
const audioContext = new AudioContext();
let nextStartTime = audioContext.currentTime;
function playPCMChunk(pcmSamples: Float32Array, sampleRate: number) {
// Create audio buffer
const buffer = audioContext.createBuffer(1, pcmSamples.length, sampleRate);
buffer.getChannelData(0).set(pcmSamples);
// Create and schedule source
const source = audioContext.createBufferSource();
source.buffer = buffer;
source.connect(audioContext.destination);
// Schedule playback (gapless)
const startTime = Math.max(nextStartTime, audioContext.currentTime);
source.start(startTime);
// Update next start time
nextStartTime = startTime + buffer.duration;
}
// Important: Convert Int16 PCM to Float32 first
function int16ToFloat32(pcm: Int16Array): Float32Array {
const float = new Float32Array(pcm.length);
for (let i = 0; i < pcm.length; i++) {
float[i] = pcm[i] / 32768;
}
return float;
}Note
User gesture required: AudioContext must be created or resumed after a user interaction (click, tap) due to browser autoplay policies.
File Size Calculation
Estimate audio file sizes for capacity planning:
PCM bytes = duration_seconds × sample_rate × bytes_per_sample
PCM bytes = duration_seconds × 24000 × 2
PCM bytes = duration_seconds × 48000
WAV bytes = PCM bytes + 44 (header)
Examples:
1 second → 48 KB
10 seconds → 480 KB
1 minute → 2.88 MB
10 minutes → 28.8 MBSee Also
- Saved Voices API — Generate speech with response_format
- SSE Streaming — Receive PCM chunks via Server-Sent Events
- WebSocket Protocol — Real-time audio with binary mode