    voicebox

    Git Repo
    jamiepine

    Voicebox is an open-source, local voice synthesis studio powered by Qwen3-TTS, enabling voice cloning, speech generation, and voice-powered app creation.

    About voicebox

    This repository introduces Voicebox, an open-source voice synthesis studio powered by Qwen3-TTS, designed for local voice cloning and speech generation.

    For the Non-Technical Reader

    Imagine you want to create a personalized audiobook using your own voice, or perhaps develop a unique voice assistant that sounds just like a family member. Voicebox allows you to clone voices from just a few seconds of audio and generate speech, all on your local machine. It's like having a professional voice-over studio at your fingertips, without the need for expensive cloud services or concerns about data privacy. You can create multi-voice stories, podcasts, or even integrate custom voices into your applications.

    For the Technical Reader

    Voicebox uses Alibaba's Qwen3-TTS model for voice cloning, aiming for high-fidelity output with natural prosody and cadence. The application is built with Tauri (Rust) for native performance and includes an MLX backend that uses Metal acceleration on Apple Silicon, with a reported 4-5x inference speedup on that hardware. Key features include:

    • Instant voice cloning from short audio samples

    • Voice profile management with import/export capabilities

    • Multi-track timeline editor for composing multi-voice projects

    • In-app recording and transcription

    Currently, Voicebox supports macOS and Windows, with Linux builds planned. The roadmap includes support for XTTS, Bark, and other models.
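
    The platform-dependent backend choice described above (MLX on Apple Silicon, something else everywhere else) can be illustrated with a minimal Python sketch. This is not code from the Voicebox repository: the function name `select_backend` and the labels "mlx" and "default" are hypothetical, and the real application may detect hardware differently.

```python
import platform


def select_backend(system: str, machine: str) -> str:
    """Return a backend label for the given OS name and CPU architecture.

    The labels "mlx" and "default" are placeholders for this sketch,
    not identifiers taken from the Voicebox codebase.
    """
    # Apple Silicon Macs report system "Darwin" and machine "arm64"
    # when Python runs natively (not under Rosetta).
    if system == "Darwin" and machine == "arm64":
        return "mlx"
    # Everything else (Intel Macs, Windows, and so on) uses the fallback.
    return "default"


if __name__ == "__main__":
    # Detect the current machine with the standard library.
    print(select_backend(platform.system(), platform.machine()))
```

    Passing the system and machine strings as arguments, rather than reading them inside the function, keeps the decision easy to check for platforms other than the one the script runs on.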

    Why It Matters

    Voicebox protects privacy by keeping voice data and models on the user's machine, in contrast to cloud-based services. Because it is open source, the community can develop and customize it, which reduces reliance on proprietary solutions. Users also avoid the subscription fees charged by cloud-based voice cloning services.

    The "Voice AI Space Lab" Idea

    Imagine building a "Storytime Creator" app for kids. Parents could clone their voice and generate personalized bedtime stories with different characters and scenarios, all powered by Voicebox and running locally on a tablet.