Real-time WebSocket
Bidirectional WebSocket API for voice agents and LLM integration with intelligent text buffering.
When to Use WebSocket
✓ Ideal For
- Voice agents / assistants
- LLM streaming output (ChatGPT-style)
- Real-time translation
- Interactive phone systems (IVR)
- Any bidirectional communication
Consider SSE Instead
- Simple demos / previews
- One-shot generation
- Mobile apps (simpler integration)
- No text buffering needed
Connection Methods
Choose the right method for your use case:
| Method | Bidirectional | Best For |
|---|---|---|
| HTTP Batch | No | Audiobooks, bulk generation |
| SSE Streaming | No | Demos, previews, mobile |
| WebSocket | Yes | Voice agents, LLM integration |
Note
Text Buffering: Of the three methods, only WebSocket supports intelligent text buffering. It accumulates tokens from an LLM and generates audio at natural sentence boundaries for smoother speech.
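To make "generate at natural sentence boundaries" concrete, here is a minimal client-side sketch of the idea. It is illustrative only: the `SentenceBuffer` class and its boundary regex are hypothetical and are not the service's actual buffering algorithm, which runs server-side.

```javascript
// Illustrative only: a simplified sentence buffer, not the service's own logic.
class SentenceBuffer {
  constructor() {
    this.pending = '';
  }

  // Add a streamed token; return any complete sentences now available.
  push(token) {
    this.pending += token;
    const sentences = [];
    // Treat ., ! or ? followed by whitespace as a sentence boundary.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
    }
    // Keep the unfinished tail for the next token.
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Return whatever is left, e.g. when the LLM stream ends.
  flush() {
    const rest = this.pending.trim();
    this.pending = '';
    return rest;
  }
}

const buffer = new SentenceBuffer();
const out = [];
for (const token of ['Hel', 'lo the', 're. How', ' are', ' you? I am', ' fine']) {
  out.push(...buffer.push(token));
}
out.push(buffer.flush());
console.log(out); // [ 'Hello there.', 'How are you?', 'I am fine' ]
```

Requiring whitespace after the punctuation keeps values such as `3.14` from being split mid-number, at the cost of holding the last sentence until the next token or a flush arrives.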
Endpoint
wss://api.murmr.dev/v1/realtime
After connecting, send a config message with your API key within 10 seconds. Time-to-first-chunk is typically ~460ms server-side.
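One way to stay inside the config window is to send the config message as soon as the socket opens and wait for the acknowledgement before sending text. The sketch below assumes the `config` and `config_ack` message types used in the Quick Example; the `openSession` helper, its `timeoutMs` parameter, and the client-side timeout are hypothetical and not part of the API.

```javascript
// Hypothetical helper: send the config message on open and resolve once the
// server replies with `config_ack`.
function openSession(ws, config, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    // Client-side guard (our assumption, not API behavior): give up if no
    // acknowledgement arrives within timeoutMs of calling this function.
    const timer = setTimeout(() => {
      reject(new Error('No config_ack received in time'));
    }, timeoutMs);

    ws.addEventListener('open', () => {
      ws.send(JSON.stringify({ type: 'config', ...config }));
    });

    ws.addEventListener('message', function onMessage(event) {
      if (typeof event.data !== 'string') return; // ignore binary frames
      const msg = JSON.parse(event.data);
      if (msg.type === 'config_ack') {
        clearTimeout(timer);
        ws.removeEventListener('message', onMessage);
        resolve(msg);
      }
    });

    ws.addEventListener('error', () => {
      clearTimeout(timer);
      reject(new Error('WebSocket error before config_ack'));
    });
  });
}

// Usage (browser or any runtime with a global WebSocket):
// const ws = new WebSocket('wss://api.murmr.dev/v1/realtime');
// openSession(ws, { api_key: 'YOUR_API_KEY', language: 'English' })
//   .then(() => ws.send(JSON.stringify({ type: 'text', text: 'Hello!' })));
```

The listeners must be attached before the socket finishes connecting, so call `openSession` immediately after constructing the `WebSocket`.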
Plan Requirement
WebSocket is available on Realtime and Scale plans. See Pricing for plan details.
Quick Example
Basic WebSocket flow with VoiceDesign:
// 1. Connect
const ws = new WebSocket('wss://api.murmr.dev/v1/realtime');

// 2. Configure session
ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'config',
    api_key: 'YOUR_API_KEY',
    voice_description: 'A warm, friendly voice',
    language: 'English'
  }));
};

// 3. Handle messages
ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === 'config_ack') {
    // Ready to send text
    ws.send(JSON.stringify({ type: 'text', text: 'Hello!' }));
    ws.send(JSON.stringify({ type: 'flush' }));
  }
  if (msg.type === 'audio') {
    // msg.chunk is base64 PCM audio
    playAudioChunk(msg.chunk, msg.sample_rate);
  }
  if (msg.type === 'done') {
    console.log(`TTFC: ${msg.first_chunk_latency_ms}ms`);
  }
};
Key Features
- Text Buffering: smart sentence boundary detection
- Binary Mode: raw PCM for lower latency
- API Key Caching: faster auth after first request
- Low Latency: fast time-to-first-chunk
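The Quick Example calls `playAudioChunk` without defining it. The sketch below shows one possible browser implementation using the Web Audio API. It assumes chunks are 16-bit signed little-endian mono PCM; the source only says "base64 PCM", so the sample format is an assumption to verify, and `decodePcm16` is a hypothetical helper name.

```javascript
// ASSUMPTION: chunks are 16-bit signed little-endian mono PCM. Confirm the
// actual sample format before relying on this.
function decodePcm16(base64Chunk) {
  const binary = atob(base64Chunk); // atob is global in browsers and Node 16+
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  const view = new DataView(bytes.buffer);
  const samples = new Float32Array(Math.floor(bytes.length / 2));
  for (let i = 0; i < samples.length; i++) {
    // Read a little-endian int16 and scale it to the [-1, 1) float range.
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}

// Hypothetical playAudioChunk for browsers: schedule chunks back to back so
// consecutive chunks play without gaps or overlap.
let audioCtx = null;
let nextStartTime = 0;

function playAudioChunk(base64Chunk, sampleRate) {
  audioCtx = audioCtx || new AudioContext();
  const samples = decodePcm16(base64Chunk);
  if (samples.length === 0) return; // createBuffer rejects zero-length buffers
  const buffer = audioCtx.createBuffer(1, samples.length, sampleRate);
  buffer.copyToChannel(samples, 0);
  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);
  // Never schedule in the past; otherwise queue after the previous chunk.
  nextStartTime = Math.max(nextStartTime, audioCtx.currentTime);
  source.start(nextStartTime);
  nextStartTime += buffer.duration;
}
```

Browsers usually keep an `AudioContext` suspended until a user gesture, so the first call may need to happen inside a click or tap handler.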
Rate Limits
WebSocket connections are subject to these limits:
- 10 concurrent connections per API key
- 100 generations per minute per key
- Character usage counts against your plan's quota
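A client can avoid hitting the per-minute limit by throttling itself before sending. The sketch below is a simple sliding-window limiter; `SlidingWindowLimiter` and `sendGeneration` are hypothetical names, and it assumes that one `text` message followed by `flush` counts as one generation, which the limits above do not spell out. The server remains the authority on what is actually allowed.

```javascript
// Hypothetical client-side guard; the server enforces the real limits.
class SlidingWindowLimiter {
  constructor(maxEvents, windowMs, now = () => Date.now()) {
    this.maxEvents = maxEvents;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.timestamps = [];
  }

  // Returns true and records the event if it fits in the current window.
  tryAcquire() {
    const t = this.now();
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.maxEvents) return false;
    this.timestamps.push(t);
    return true;
  }
}

// 100 generations per minute per key, as listed above.
const generationLimiter = new SlidingWindowLimiter(100, 60000);

// ASSUMPTION: one text + flush pair is treated as a single generation.
function sendGeneration(ws, text) {
  if (!generationLimiter.tryAcquire()) return false; // caller should retry later
  ws.send(JSON.stringify({ type: 'text', text }));
  ws.send(JSON.stringify({ type: 'flush' }));
  return true;
}
```

When `sendGeneration` returns `false`, the caller can queue the text and retry after a short delay rather than dropping it.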