Real-time WebSocket

Bidirectional WebSocket API for voice agents and LLM integration with intelligent text buffering.

When to Use WebSocket

✓ Ideal For

Voice agents / assistants
LLM streaming output (ChatGPT-style)
Real-time translation
Interactive phone systems (IVR)
Any bidirectional communication

Consider SSE Instead

Simple demos / previews
One-shot generation
Mobile apps (simpler integration)
No text buffering needed

Connection Methods

Choose the right method for your use case:

Method	Bidirectional	Best For
HTTP Batch	No	Audiobooks, bulk generation
SSE Streaming	No	Demos, previews, mobile
WebSocket	Yes	Voice agents, LLM integration

Note

Text Buffering: Only the WebSocket method supports intelligent text buffering. It accumulates tokens from an LLM and generates audio at natural sentence boundaries, which produces smoother speech.
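A minimal sketch of how a client might feed LLM tokens into this buffer. `TokenForwarder` is a hypothetical helper, not part of any SDK. Only the `text` and `flush` message types are taken from the Quick Example on this page; sending one `text` message per token is an assumption about typical usage.

```javascript
// Hypothetical helper: forwards LLM tokens to an already-configured socket.
// Only the `text` and `flush` message types come from the Quick Example.
class TokenForwarder {
  constructor(ws) {
    this.ws = ws; // any object with send(string), e.g. a WebSocket
  }

  // Send one token as-is; sentence-boundary buffering is left to the server.
  push(token) {
    if (token.length === 0) return; // skip empty deltas from the LLM stream
    this.ws.send(JSON.stringify({ type: 'text', text: token }));
  }

  // Call once the LLM response is complete. Assumed to make the server
  // synthesize whatever text is still sitting in its buffer.
  end() {
    this.ws.send(JSON.stringify({ type: 'flush' }));
  }
}
```

Call `push()` for every token as it streams in and `end()` once after the last one, so the client never has to detect sentence boundaries itself.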

Endpoint

wss://api.murmr.dev/v1/realtime

After connecting, send a config message containing your API key within 10 seconds. Server-side time-to-first-chunk is typically around 460 ms.
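One way to handle the config step is sketched below. `buildConfigMessage` and `startAckTimer` are hypothetical helper names. The field names mirror the Quick Example on this page. The acknowledgement timeout is a client-side choice that simply reuses the 10-second figure; it is not a documented server behavior.

```javascript
// Client-side choice: mirrors the 10-second config window stated above.
const ACK_TIMEOUT_MS = 10_000;

// Hypothetical helper: serialize the config message, failing fast on a missing key.
function buildConfigMessage({ apiKey, voiceDescription, language }) {
  if (!apiKey) throw new Error('apiKey is required');
  return JSON.stringify({
    type: 'config',
    api_key: apiKey,
    voice_description: voiceDescription,
    language,
  });
}

// Hypothetical guard: close the socket if no config_ack arrives in time.
// Returns a cancel function to call once config_ack has been received.
function startAckTimer(ws, timeoutMs = ACK_TIMEOUT_MS) {
  const timer = setTimeout(() => ws.close(), timeoutMs);
  return () => clearTimeout(timer);
}
```

Sending the result of `buildConfigMessage(...)` from the `onopen` handler keeps the client well inside the window. The timer only guards against a connection that opens but never acknowledges the config.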

Plan Requirement

WebSocket is available on Realtime and Scale plans. See Pricing for plan details.

Quick Example

Basic WebSocket flow with VoiceDesign:

// 1. Connect
const ws = new WebSocket('wss://api.murmr.dev/v1/realtime');

// 2. Configure session
ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'config',
    api_key: 'YOUR_API_KEY',
    voice_description: 'A warm, friendly voice',
    language: 'English'
  }));
};

// 3. Handle messages
ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);

  if (msg.type === 'config_ack') {
    // Ready to send text
    ws.send(JSON.stringify({ type: 'text', text: 'Hello!' }));
    ws.send(JSON.stringify({ type: 'flush' }));
  }

  if (msg.type === 'audio') {
    // msg.chunk is base64 PCM audio.
    // playAudioChunk is your own playback function; it is not defined here.
    playAudioChunk(msg.chunk, msg.sample_rate);
  }

  if (msg.type === 'done') {
    console.log(`TTFC: ${msg.first_chunk_latency_ms}ms`);
  }
};
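
The example above calls `playAudioChunk` without defining it. A possible sketch of the decoding and playback step follows. It assumes the chunks are 16-bit signed little-endian mono PCM, which this page does not state (it says only "base64 PCM"). `decodePcm16` and `playPcmChunk` are hypothetical names, and the playback part relies on the browser Web Audio API.

```javascript
// Assumption: 16-bit signed little-endian mono PCM (not confirmed by the docs).
function decodePcm16(base64) {
  const binary = atob(base64); // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);

  const view = new DataView(bytes.buffer);
  const sampleCount = Math.floor(bytes.length / 2);
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    // Scale int16 [-32768, 32767] to the float range Web Audio expects.
    samples[i] = view.getInt16(i * 2, true) / 32768; // true = little-endian
  }
  return samples;
}

// Browser-only: schedule chunks back to back so they do not overlap.
let nextStartTime = 0;
function playPcmChunk(chunk, sampleRate, audioCtx) {
  const samples = decodePcm16(chunk);
  if (samples.length === 0) return;

  const buffer = audioCtx.createBuffer(1, samples.length, sampleRate);
  buffer.copyToChannel(samples, 0);

  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);

  // Start when the previous chunk ends, or now if playback has fallen behind.
  nextStartTime = Math.max(nextStartTime, audioCtx.currentTime);
  source.start(nextStartTime);
  nextStartTime += buffer.duration;
}
```

To use it with the example, create a single `AudioContext` and define `playAudioChunk` as `(chunk, rate) => playPcmChunk(chunk, rate, audioCtx)`. If the service uses a different sample format, only `decodePcm16` needs to change.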