Interview
Mohamed Hani ElMasry
Founder & CEO
SentiVue
How did you end up in voice?
I didn't start with voice AI as a category. I started from frustration: the phone is still where many of the most important conversations happen, but it has barely changed for people who don't share a language. Translation had advanced in text. AI agents were making rapid progress in chat. But live spoken conversations, especially phone calls and in-person events, were still messy, slow, and often dependent on a human interpreter or an app both sides needed to use. Voice pulled me in because it exposes everything. You can't hide behind a nice interface. Latency, accent, tone, turn-taking, and trust all show up immediately. That makes it hard, but also worth building.
What's your struggle or moment of joy with voice?
The biggest lesson for me was discovering how unforgiving voice really is. In text, you can tolerate a lot. You can wait a few seconds for a response. You can reread a sentence. You can correct mistakes. Humans are surprisingly patient. Voice doesn't work that way. A system can be 95% accurate and still feel broken because it interrupted at the wrong moment or responded half a second too late. Most people outside the industry underestimate how difficult that last 5% is. One of my favorite moments came during a live event. We had spent months focusing on models, infrastructure, and latency. Then, at the event, two people who didn't speak the same language held a panel discussion on stage. The audience followed. Nobody discussed the technology. Nobody asked how it worked. For me, that's the highest compliment in voice. The technology disappears and the conversation remains.
Where do you think voice is going?
The industry is still too focused on voice assistants and not focused enough on communication infrastructure. Most conversations about voice AI are about replacing humans. There's a much bigger opportunity in enabling humans. The most interesting use cases I've seen aren't people talking to AI. They're people talking to other people through AI. Language translation is one example, but there are many others. Accessibility, customer service, public services, healthcare, and global collaboration all become fundamentally different when language and communication barriers start disappearing. I also think voice will move closer to the network itself. Today, most voice AI lives inside apps. Over time, more intelligence will sit underneath the experience, embedded in the communication layer itself. If that happens, users won't think about voice AI at all. They'll simply expect every conversation to work, regardless of language, device, or location. That's the future I find most interesting.
Key Links
https://www.voiceaispace.com/tool/sentivue-events