
    VoiceBlender

    Programmable voice platform bridging SIP and WebRTC with multi-party mixing, recording, TTS, STT, and AI agents via REST APIs.

    About VoiceBlender

    VoiceBlender is a programmable voice platform that bridges traditional telephony (SIP), web communication (WebRTC), and AI agents. It exposes a unified REST API for managing voice workflows such as multi-party mixing, recording, and real-time speech processing (TTS and STT).

    For the Non-Technical Reader

    Think of VoiceBlender as a universal smart switchboard for the AI era. In the past, connecting a phone call to a web browser or an AI brain was incredibly complex and expensive. VoiceBlender acts like a set of digital LEGO bricks, allowing businesses to easily connect phone lines, WhatsApp calls, and web apps to AI assistants. Whether you want to build a customer support bot that sounds human or a system that records and summarizes meetings automatically, this tool handles the heavy lifting of 'connecting the wires' so you can focus on the experience.

    For the Technical Reader

    VoiceBlender is a high-performance service written in Go that sits between VoIP endpoints and AI agents. Key architectural highlights include:

    • Protocol Support: Inbound and outbound SIP over UDP and TLS with RFC 4028 session timers, plus WebRTC via SDP offer/answer with trickle ICE.
    • Audio Engine: Multi-party mixing with mix-minus-self audio at configurable sample rates (up to 48 kHz).
    • AI Integration: Native support for pluggable AI agents including ElevenLabs, VAPI, Pipecat, and Deepgram, featuring mid-session context injection.
    • Telephony Features: RFC 4733 DTMF handling, Answering Machine Detection (AMD) using Goertzel frequency analysis, and SIP 183 early media support.
    • Observability: Real-time event delivery via Webhooks (HMAC-SHA256 signed) and a WebSocket event stream (VSI), complemented by Prometheus metrics.
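
The mix-minus-self mixing listed under Audio Engine can be illustrated with a small sketch. It is not taken from the VoiceBlender source: the function names `mixMinusSelf` and `clamp16` are hypothetical, and the code assumes 16-bit PCM frames of equal length at a shared sample rate. Each participant receives the sum of everyone else's audio, so nobody hears their own voice echoed back.

```go
package main

import "fmt"

// clamp16 saturates a 32-bit sum to the 16-bit PCM range.
func clamp16(v int32) int16 {
	if v > 32767 {
		return 32767
	}
	if v < -32768 {
		return -32768
	}
	return int16(v)
}

// mixMinusSelf returns one output frame per participant.
// Assumption: every input frame has the same length.
func mixMinusSelf(frames [][]int16) [][]int16 {
	if len(frames) == 0 {
		return nil
	}
	n := len(frames[0])

	// Sum all participants once, in a wider type to avoid overflow.
	total := make([]int32, n)
	for _, f := range frames {
		for i := 0; i < n; i++ {
			total[i] += int32(f[i])
		}
	}

	// Each participant hears the total minus their own contribution.
	out := make([][]int16, len(frames))
	for p, f := range frames {
		out[p] = make([]int16, n)
		for i := 0; i < n; i++ {
			out[p][i] = clamp16(total[i] - int32(f[i]))
		}
	}
	return out
}

func main() {
	frames := [][]int16{
		{100, 200}, // participant 0
		{10, 20},   // participant 1
		{1, 2},     // participant 2
	}
	fmt.Println(mixMinusSelf(frames)) // prints [[11 22] [101 202] [110 220]]
}
```

Summing once and subtracting each participant's own samples keeps the cost linear in the number of participants, rather than building a separate mix for every listener.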
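
The Goertzel frequency analysis mentioned under Telephony Features measures the energy of a single frequency without computing a full FFT. The sketch below shows the standard algorithm only; how VoiceBlender applies it to Answering Machine Detection is not described here, and the helper names (`goertzelPower`, `sineWave`), the 8 kHz sample rate, and the 1000 Hz test tone are illustrative assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// sineWave generates n samples of a unit-amplitude sine at freq Hz.
func sineWave(freq, sampleRate float64, n int) []float64 {
	out := make([]float64, n)
	for i := range out {
		out[i] = math.Sin(2 * math.Pi * freq * float64(i) / sampleRate)
	}
	return out
}

// goertzelPower returns the squared magnitude of the DFT bin closest to
// targetHz. It returns 0 for an empty input.
func goertzelPower(samples []float64, sampleRate, targetHz float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	n := float64(len(samples))
	k := math.Round(n * targetHz / sampleRate) // nearest DFT bin
	coeff := 2 * math.Cos(2*math.Pi*k/n)

	// Second-order recurrence over the samples.
	var s1, s2 float64
	for _, x := range samples {
		s0 := x + coeff*s1 - s2
		s2 = s1
		s1 = s0
	}
	return s1*s1 + s2*s2 - coeff*s1*s2
}

func main() {
	const sampleRate = 8000.0
	// 20 ms of a 1000 Hz tone; the frequency is an arbitrary example value.
	tone := sineWave(1000, sampleRate, 160)

	onTarget := goertzelPower(tone, sampleRate, 1000)
	offTarget := goertzelPower(tone, sampleRate, 1400)
	fmt.Println("tone detected at 1000 Hz:", onTarget > 100*offTarget)
}
```

A detector built on this would typically compare the power at a few frequencies of interest against a threshold over successive short windows; the threshold used in `main` is arbitrary.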
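
A receiver of the HMAC-SHA256 signed webhooks noted under Observability would normally recompute the signature over the raw request body and compare it in constant time. The following is a minimal sketch using only the Go standard library. The hex encoding of the signature, the shared-secret handling, and the function names `signBody` and `verifySignature` are assumptions for illustration; the actual header name and encoding should be taken from the VoiceBlender documentation.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signBody computes a hex-encoded HMAC-SHA256 of body using secret.
// Hex encoding is an assumption, not a documented VoiceBlender detail.
func signBody(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature reports whether signatureHex matches the HMAC of body.
// hmac.Equal performs a constant-time comparison.
func verifySignature(secret, body []byte, signatureHex string) bool {
	got, err := hex.DecodeString(signatureHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-shared-secret")  // hypothetical secret
	body := []byte(`{"event":"call.answered"}`) // hypothetical payload

	sig := signBody(secret, body)
	fmt.Println("valid:", verifySignature(secret, body, sig))
	fmt.Println("tampered:", verifySignature(secret, []byte(`{"event":"x"}`), sig))
}
```

The check must run against the exact bytes received, before any JSON parsing or re-serialization, since even whitespace changes alter the digest.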

    Why It Matters

    The industry is shifting from rigid, proprietary telephony stacks to AI-native communication. VoiceBlender matters because it offers an open-source, API-first alternative to expensive middleware. By supporting WhatsApp Business Calling alongside traditional SIP and WebRTC, it lets developers build cross-platform voice applications that are not locked into a single provider's ecosystem, which can lower costs and widen deployment options.

    The Voice AI Space Lab Idea

    Imagine building a "Legacy-to-AI Bridge" for local community centers. You could use VoiceBlender to allow seniors to call in via traditional landlines (SIP) to a community "radio" room. An AI agent could moderate the discussion, while younger community members join via a sleek web interface (WebRTC). The AI could provide real-time transcription for the hearing impaired and automatically post a summary of the town hall discussion to a local news blog immediately after the call ends.

    Explore the repository here: https://github.com/VoiceBlender/voiceblender