Real-time WebSocket

Bidirectional WebSocket API for voice agents and LLM integration with intelligent text buffering.

When to Use WebSocket

✓ Ideal For

  • Voice agents / assistants
  • LLM streaming output (ChatGPT-style)
  • Real-time translation
  • Interactive phone systems (IVR)
  • Any bidirectional communication

Consider SSE Instead

  • Simple demos / previews
  • One-shot generation
  • Mobile apps (simpler integration)
  • No text buffering needed

Connection Methods

Choose the right method for your use case:

Method         | Bidirectional | Best For
HTTP Batch     | No            | Audiobooks, bulk generation
SSE Streaming  | No            | Demos, previews, mobile
WebSocket      | Yes           | Voice agents, LLM integration

Note

Text Buffering: Of the three methods, only WebSocket supports intelligent text buffering. It accumulates tokens from an LLM and generates audio at natural sentence boundaries for smoother speech.
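
To make the idea of sentence-boundary buffering concrete, here is a minimal client-side sketch. It is an illustration of the concept only: the service performs its buffering server-side, and its actual boundary rules are not documented here. The name `createSentenceBuffer` and the simple punctuation-plus-whitespace rule are assumptions made for this example.

```javascript
// Illustrative sketch only (hypothetical helper, not part of the API).
// Accumulates streamed tokens and emits text once a sentence looks complete.
function createSentenceBuffer(onSentence) {
  let pending = '';
  // Assumed rule: a sentence ends at ., ! or ? followed by whitespace.
  const boundary = /[.!?]+\s+/g;

  return {
    // Add one token; emit every complete sentence found so far.
    push(token) {
      pending += token;
      let lastEnd = 0;
      let match;
      boundary.lastIndex = 0;
      while ((match = boundary.exec(pending)) !== null) {
        const end = match.index + match[0].length;
        onSentence(pending.slice(lastEnd, end).trim());
        lastEnd = end;
      }
      pending = pending.slice(lastEnd);
    },
    // Emit whatever is left, even without closing punctuation.
    flush() {
      const rest = pending.trim();
      pending = '';
      if (rest) onSentence(rest);
    },
  };
}
```

Waiting for whitespace after the punctuation keeps a token such as "3." from being split off before "14" arrives, at the cost of holding the final sentence until `flush()` is called.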

Endpoint

wss://api.murmr.dev/v1/realtime

After connecting, send a config message containing your API key within 10 seconds. Server-side time-to-first-chunk is typically around 460 ms.
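
One way to stay inside the 10-second window is to send the config as soon as the socket opens and wait for the acknowledgement before sending text. The sketch below assumes the `config` and `config_ack` message types used elsewhere on this page. The helper name `configureSession` and the client-side wait for the acknowledgement (`ackTimeoutMs`) are assumptions for illustration, not documented server behavior.

```javascript
// Hypothetical helper: call it once the socket is open (e.g. from ws.onopen).
// Sends the config immediately, resolves with the config_ack message, and
// rejects if no acknowledgement arrives within ackTimeoutMs (our own guard).
function configureSession(ws, config, ackTimeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      ws.removeEventListener('message', onMessage);
      reject(new Error('No config_ack received in time'));
    }, ackTimeoutMs);

    function onMessage(event) {
      if (typeof event.data !== 'string') return; // skip binary frames
      const msg = JSON.parse(event.data);
      if (msg.type !== 'config_ack') return;
      clearTimeout(timer);
      ws.removeEventListener('message', onMessage);
      resolve(msg);
    }

    ws.addEventListener('message', onMessage);
    ws.send(JSON.stringify({ type: 'config', ...config }));
  });
}

// Usage (browser):
//   const ws = new WebSocket('wss://api.murmr.dev/v1/realtime');
//   ws.onopen = () =>
//     configureSession(ws, { api_key: 'YOUR_API_KEY', language: 'English' })
//       .then(() => ws.send(JSON.stringify({ type: 'text', text: 'Hello!' })));
```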

Plan Requirement

WebSocket is available on Realtime and Scale plans. See Pricing for plan details.

Quick Example

Basic WebSocket flow with VoiceDesign:

// 1. Connect
const ws = new WebSocket('wss://api.murmr.dev/v1/realtime');

// 2. Configure session
ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'config',
    api_key: 'YOUR_API_KEY',
    voice_description: 'A warm, friendly voice',
    language: 'English'
  }));
};

// 3. Handle messages
ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);

  if (msg.type === 'config_ack') {
    // Ready to send text
    ws.send(JSON.stringify({ type: 'text', text: 'Hello!' }));
    ws.send(JSON.stringify({ type: 'flush' }));
  }

  if (msg.type === 'audio') {
    // msg.chunk is base64 PCM audio.
    // playAudioChunk is your own playback function (not defined here).
    playAudioChunk(msg.chunk, msg.sample_rate);
  }

  if (msg.type === 'done') {
    console.log(`TTFC: ${msg.first_chunk_latency_ms}ms`);
  }
};
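
The example calls `playAudioChunk` without defining it. The following is one possible browser implementation using the Web Audio API, offered as a sketch. It assumes each chunk is 16-bit signed little-endian mono PCM, which this page does not state (it says only "base64 PCM"), so verify the sample format before relying on it. `decodePcm16` is a hypothetical helper name.

```javascript
// Assumption: chunks are 16-bit signed little-endian mono PCM.
// Decodes a base64 chunk into float samples in the range [-1, 1).
function decodePcm16(base64Chunk) {
  const binary = atob(base64Chunk);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);

  const view = new DataView(bytes.buffer);
  const samples = new Float32Array(Math.floor(bytes.length / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768; // true = little-endian
  }
  return samples;
}

let audioCtx = null;   // created lazily; browsers may require a user gesture
let nextStartTime = 0; // when the next chunk should begin playing

// Schedules each chunk directly after the previous one to avoid gaps.
function playAudioChunk(base64Chunk, sampleRate) {
  if (!audioCtx) audioCtx = new AudioContext();
  const samples = decodePcm16(base64Chunk);
  if (samples.length === 0) return;

  const buffer = audioCtx.createBuffer(1, samples.length, sampleRate);
  buffer.copyToChannel(samples, 0);

  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);

  nextStartTime = Math.max(nextStartTime, audioCtx.currentTime);
  source.start(nextStartTime);
  nextStartTime += buffer.duration;
}
```

Scheduling against `nextStartTime` rather than starting each chunk immediately keeps consecutive chunks contiguous even when they arrive faster than they play.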

Key Features

  • Text Buffering: Smart sentence boundary detection
  • Binary Mode: Raw PCM for lower latency
  • API Key Caching: Faster auth after first request
  • Low Latency: Fast time-to-first-chunk

Rate Limits

WebSocket connections are subject to these limits:

  • 10 concurrent connections per API key
  • 100 generations per minute per key
  • Character usage counts against your plan's quota
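
A client can avoid hitting the per-minute limit by throttling itself before sending. The sketch below uses a sliding window with the documented figure of 100 generations per minute as its default. Whether the server counts with a sliding or a fixed window, and exactly what it counts as one generation, are not stated here, so treat this as a conservative client-side guard. `createRateLimiter` is a hypothetical helper name.

```javascript
// Hypothetical client-side guard: allows at most `limit` events in any
// rolling window of `windowMs` milliseconds.
function createRateLimiter(limit = 100, windowMs = 60000) {
  const timestamps = []; // times of accepted events, oldest first

  return {
    // Returns true and records the event if it fits in the window,
    // otherwise returns false. `now` is injectable for testing.
    tryAcquire(now = Date.now()) {
      while (timestamps.length > 0 && now - timestamps[0] >= windowMs) {
        timestamps.shift(); // drop events that have left the window
      }
      if (timestamps.length >= limit) return false;
      timestamps.push(now);
      return true;
    },
  };
}

// Usage: check before each request that triggers a generation.
//   const limiter = createRateLimiter();
//   if (limiter.tryAcquire()) ws.send(JSON.stringify({ type: 'flush' }));
```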