Speech Research Scientist | Bangalore
Smallest
Posted on 4/5/2026
Job Description
Team: Core Speech Research
Location: Bangalore, India
Type: Full-time
Experience: No fixed bar — skill and depth matter more than years
About Smallest.ai
Smallest.ai builds real-time voice intelligence systems operating at enterprise scale.
We work across speech recognition, speech generation, and speech-to-speech systems with a strong focus on low latency, multilingual intelligence, and production reliability.
Our goal is simple: Smaller models. Lower latency. Higher intelligence.
Role Overview
As a Speech Research Scientist, you will work on the core speech stack at Smallest.ai.
You will research, train, evaluate, and productionize models across:
Speech to Text (ASR)
Text to Speech (TTS)
Speech to Speech (S2S)
This is not an offline research role.
You will work at the intersection of research, engineering, and real-world deployment.
Core Research Areas
A. Automatic Speech Recognition (ASR)
Streaming and non-streaming ASR
Multilingual and code-mixed speech
Low-latency decoding and inference
Long-context speech modeling
Robustness to accents, noise, and telephony audio
B. Text to Speech (TTS)
Neural TTS and generative speech models
Controllable speech generation including emotion, style, pitch, rate, and prosody
Speaker adaptation and voice cloning
Stability, expressiveness, and naturalness optimization
C. Speech to Speech (S2S)
End-to-end speech-to-speech models
Streaming voice-to-voice architectures
Codec-based or token-based speech representations
Low-latency conversational speech generation
D. Multilingual and Speaker Intelligence
Multilingual speaker understanding
Cross-lingual speaker embeddings
Speaker identification and verification
Accent and dialect robustness
Low-resource language modeling
E. Multi-Speaker Modeling
Multi-speaker diarization
Overlapping speech detection and separation
Speaker-aware ASR pipelines
Joint diarization and recognition modeling
F. Duplex Conversational Models
Full-duplex speech models
Simultaneous listening and speaking
Interruption handling and barge-in detection
Half-duplex conversational models
Turn detection
Latency-aware response generation
What You Will Build
Novel model architectures and training strategies
Large-scale multilingual datasets and pipelines
Evaluation frameworks for word error rate (WER), diarization error rate (DER), mean opinion score (MOS), latency, and real-time factor (RTF)
Streaming inference systems for real-time speech
Research prototypes converted into production models
Your work will directly power live customer-facing systems.
Required Skills
Strong background in speech processing or deep learning
Deep expertise in at least one of the following:
ASR
TTS
Speech-to-speech systems
Strong understanding of modern architectures:
Transformers, Conformers, and diffusion or flow-based models
Experience with CTC, Transducer, or attention-based decoding
Strong proficiency in PyTorch
Experience training models at scale
Strong Plus
Multilingual speech experience (Indic or European languages)
Speaker embeddings and diarization systems
Parameter-efficient fine-tuning methods such as LoRA
Streaming inference optimization
Deployment experience using ONNX, TensorRT, or Triton
Publications, open-source contributions, or serious personal research projects
What We Care About
Depth over buzzwords
Clean experiments and reproducibility
Strong benchmarking discipline
Latency, memory, and throughput awareness
Research that translates into shipped systems
We value people who ask:
“How does this behave at scale?”
Not just: “Does this work on the dataset?”
Why Smallest.ai
Work on real-world speech systems at scale
Direct ownership from research to production
Close collaboration with founders and infrastructure teams
Fast iteration cycles with minimal bureaucracy
Competitive compensation and meaningful ESOPs
One of the deepest speech research stacks in India
How to Apply
To apply, email us at the address below. It helps if you include:
Resume
Research papers, GitHub repositories, or technical writing
Examples of models you trained or systems you built
A short note on what aspect of speech research excites you most
Email: hetvi@smallest.ai