neutts

This repository hosts NeuTTS, an on-device text-to-speech (TTS) model developed by Neuphonic.

For the Non-Technical Reader

Imagine having a personal voice assistant that lives directly on your phone, laptop, or even a small device like a Raspberry Pi. NeuTTS makes this possible. Think of it like having a voice actor in your pocket, ready to speak any text you provide, without needing an internet connection. You could use it to create audiobooks, give your smart home devices a voice, or build toys that speak naturally. The key is that it works offline, so your data stays private and the voice is always available.

For the Technical Reader

NeuTTS pairs a small LLM backbone with NeuCodec, a 50 Hz neural audio codec, to synthesize high-quality speech. The models are available in GGML format, optimized for on-device inference. Key specifications include:

Supported Language: English
Context Window: 2048 tokens (~30 seconds of audio)
Models: NeuTTS-Air (~360M active parameters, ~552M with embeddings), NeuTTS-Nano (~120M active parameters, ~229M with embeddings)
Cloning: Both models support voice cloning with as little as 3 seconds of audio.
License: Apache 2.0 (NeuTTS-Air) and NeuTTS Open License 1.0 (NeuTTS-Nano)
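
The context-window figure above can be sanity-checked with a little arithmetic. The sketch below assumes NeuCodec emits one token per 50 Hz frame and that text and reference-audio tokens share the 2048-token window with the generated audio. Neither assumption is confirmed by the specifications listed here, and the function names are illustrative only, not part of any NeuTTS API.

```python
CODEC_HZ = 50          # NeuCodec frame rate, from the specifications above
CONTEXT_TOKENS = 2048  # context window, from the specifications above


def audio_tokens(seconds: float, codec_hz: int = CODEC_HZ) -> int:
    """Tokens needed for `seconds` of audio, assuming one token per codec frame."""
    return round(seconds * codec_hz)


def remaining_tokens(generated_seconds: float, reference_seconds: float = 0.0) -> int:
    """Context left over for text once audio tokens are counted.

    Assumption: reference audio used for cloning occupies the same window
    as the generated audio.
    """
    used = audio_tokens(generated_seconds) + audio_tokens(reference_seconds)
    return CONTEXT_TOKENS - used


if __name__ == "__main__":
    # Upper bound in seconds if every token in the window were audio.
    print(CONTEXT_TOKENS / CODEC_HZ)
    # Tokens consumed by the quoted ~30 seconds of audio.
    print(audio_tokens(30))
    # Tokens left after 30 s of output plus a 3 s cloning reference.
    print(remaining_tokens(30, 3))
```

Under these assumptions the window would cap out at roughly 41 seconds of pure audio, so the quoted ~30 seconds leaves about 550 tokens of headroom for the text prompt and any reference clip.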

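Benchmarks show generation speeds at or near real time on mid-range devices. For example, NeuTTS-Nano achieves 45 tokens/s on a Galaxy A25 5G (CPU only), just under the 50 tokens/s that a 50 Hz codec needs for real-time playback if each token is one codec frame, and 19,268 tokens/s on an RTX 4090. See the Hugging Face page for model downloads and demos.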
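
To relate tokens/s to playback speed, one can divide the throughput by the codec frame rate. The following is a minimal sketch that assumes one generated token corresponds to one 50 Hz codec frame and ignores prompt processing and codec decoding time, so it gives an optimistic estimate. The helper names are hypothetical.

```python
CODEC_HZ = 50  # tokens per second of audio, assuming one token per codec frame

# Throughput figures quoted in the text for NeuTTS-Nano (tokens per second).
BENCHMARKS = {
    "Galaxy A25 5G (CPU only)": 45,
    "RTX 4090": 19268,
}


def real_time_factor(tokens_per_second: float, codec_hz: int = CODEC_HZ) -> float:
    """Seconds of audio produced per second of wall-clock time (>1 is faster than playback)."""
    return tokens_per_second / codec_hz


def seconds_to_generate(audio_seconds: float, tokens_per_second: float,
                        codec_hz: int = CODEC_HZ) -> float:
    """Wall-clock time to generate `audio_seconds` of speech at a given throughput."""
    return audio_seconds * codec_hz / tokens_per_second


if __name__ == "__main__":
    for device, tps in BENCHMARKS.items():
        rtf = real_time_factor(tps)
        wait = seconds_to_generate(10, tps)
        print(f"{device}: {rtf:.2f}x real time, {wait:.2f} s for 10 s of audio")
```

A factor below 1 means audio is generated more slowly than it plays back, which matters for streaming use cases but less so when speech is rendered ahead of time.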

Why It Matters

NeuTTS is part of a shift towards on-device voice AI, with advantages in privacy (text and audio stay on the device), latency (no network round trip), and cost (no per-request API fees). By moving processing away from web APIs and onto local devices, NeuTTS unlocks new possibilities for embedded voice agents and compliance-safe applications. The Apache 2.0 license of NeuTTS-Air invites community development, while the smaller NeuTTS-Nano enables deployment on resource-constrained devices.

The "Voice AI Space Lab" Idea

Imagine building a language learning app where users practice pronunciation and hear instant spoken feedback from a virtual tutor voiced by NeuTTS. The app could even clone the user's own voice to make the experience more personal and engaging.

The Collaborative CTA

Given the trend towards on-device AI, what are the biggest challenges you foresee in deploying TTS models like NeuTTS in real-world applications, and how can the community collaborate to overcome them? #VoiceAI #TTS
