    Machine Learning Scientist

    Rime

    Engineering
    Full-time
    Remote
    United States | Remote

    Posted on 5/25/2026

    Job Description

    Rime builds voice AI for enterprises running customer experiences at scale. Our text-to-speech models are purpose-built for high-volume conversational deployments, engineered for the pronunciation accuracy, latency, and deployment flexibility that production environments actually demand.

    We started from a different premise than the rest of the field: voice AI isn't bottlenecked by model architecture. It's bottlenecked by data. So before we trained a single model, we built our own corpus: full-duplex, studio-quality conversational speech, recorded and annotated by PhD linguists. That's our moat. It's also why enterprises pick Rime when pilots need to convert into production.

    We're backed by top-tier investors including Unusual Ventures, and we've built a team at the intersection of product, research, and craft. Building voice models is an art. We intend to master it.

    Role Overview

    We're hiring a Machine Learning Scientist to push the frontier of speech synthesis and speech understanding at Rime.

    What You'll Own

    • Design, train, and evaluate speech synthesis models, autoregressive and non-autoregressive.

    • Drive research on full-duplex and half-duplex multi-modal architectures, including unified speech-to-speech (S2S) systems.

    • Choose and iterate on speech representations: neural codecs, semantic tokens, mel features, continuous latents.

    • Build rigorous evaluation, objective and perceptual. Hold the bar on quality and prosodic control.

    • Collaborate with our linguists on TTS frontend behavior so modeling and frontend choices reinforce each other.

    What We're Looking For

    • Deep familiarity with the speech synthesis literature, contemporary and historical — Tacotron, FastSpeech, VITS, VALL-E, the codec-LM lineage. Opinions on what worked and why.

    • Hands-on training with neural codecs (EnCodec, DAC, Mimi, etc.) and multiple representation choices.

    • Experience with full- or half-duplex multi-modal modeling (Moshi, LLaMA-Omni, streaming S2S).

    • Strong attention to detail on data quality. You notice when an annotation pipeline is silently degrading or when an eval set has leakage.

    • Willing to roll up your sleeves on unglamorous data and training work — paired with the agency to build pipelines so the team isn't stuck doing it by hand.

    • Working knowledge of the TTS frontend (G2P, normalization, prosody) and experience working with linguists.

    • Strong PyTorch fundamentals. Comfortable with training loops, distributed training, and model internals.

    • PhD or equivalent research experience in speech, audio, ML, or computational linguistics, or a track record that makes the credential irrelevant.

    Nice to have

    • Multilingual TTS experience.

    • Background in prosody or paralinguistics.

    • Published work in speech, audio, or core ML venues.

    • Experience taking research models to production: quantization, distillation, streaming inference.

    Why Join Rime

    • Category-defining voice AI infrastructure, not incremental research deltas.

    • Direct collaboration with founders, including a CEO with a Stanford computational linguistics PhD.

    • Real impact on company trajectory.

    • Meaningful equity upside.

    • High ownership, high standards, low bureaucracy.

    What We Offer

    • Competitive base + meaningful early-stage equity

    • Remote-friendly

    • Visa sponsorship available

    • Access to a proprietary, full-duplex, studio-quality conversational speech corpus

    • Compute and tooling to do the work

    • Direct influence on the future of voice AI

    At Rime, we...

    • Are outliers

    • Cut through the hype to focus on the craft

    • Move fast with agency and freedom

    • Maintain a growth mindset, finding joy in the struggle

    • Do the right things, knowing that doing so will lead to making money


    If that sounds like you too, you'll be a great fit for Rime!