Staff ML Research Engineer
Maple
Posted on 5/28/2025
Job Description
At Maple, we’re building AI agents that work for local businesses: restaurants, salons, repair shops, and everything in between. These agents answer calls, take orders, book appointments, and handle real customer interactions over natural voice.
But our bigger mission goes deeper: we’re building automated ontologies that model how businesses actually operate — their services, workflows, constraints, and language — so our agents can adapt to them instantly. We meet businesses where they are, not where software wants them to be.
We have many customers, strong revenue growth, years of runway, and backing from world-class investors. We’ll share more once we meet.
About the Role
As a Staff ML Research Engineer at Maple, you'll be part of our core product team, transforming cutting-edge research into production-ready voice agents that serve millions of interactions for local businesses. You'll collaborate with experts from Google Brain, Two Sigma, Stanford, MIT, Columbia, and IBM, rapidly deploying advanced models and systems that directly impact small businesses.
We work in person, 5 days a week in our NYC office. Collaboration here is fast, noisy (in the best way), and high-trust. We move quickly, break things intentionally, and fix them just as fast.
What You'll Do
Optimize speech recognition (ASR), large language models (LLMs), and text-to-speech (TTS) for real-world use, ensuring accuracy in diverse, noisy environments.
Fine-tune LLMs with retrieval-augmented generation (RAG), reinforcement learning (RL), and prompt engineering for dynamic, context-aware conversations.
Integrate AI components into autonomous agents capable of complex tasks like scheduling, order-taking, and issue resolution.
Create human-in-the-loop and automated systems to monitor performance, detect anomalies, and continuously improve models from real-world feedback.
Develop pipelines to construct knowledge graphs from business data, powering adaptive AI interactions.
Work with infrastructure teams to scale models efficiently across GPU/TPU clusters and edge devices, minimizing latency.
Manage rapid experimentation, training, and highly optimized production inference.
Lead evaluations, error analysis, and iterative improvements to maintain robustness and scalability.
Balance research innovation with practical usability by working closely with product and customer teams.
Publish research, contribute to open-source, and present at industry-leading conferences.
What We're Looking For
3-7+ years of experience deploying impactful ML models, ideally in voice, NLP, knowledge graphs, or agent systems.
Deep knowledge of speech recognition, language models, RL/dialogue systems, TTS, ontology systems, or agent orchestration.
Proficiency in PyTorch or JAX; optimization experience with CUDA/Triton preferred.
Proven ability to minimize latency and resource use on GPUs/TPUs or edge hardware.
A strong data-driven approach, with a track record of measurable improvements.
Passion for creating intuitive, helpful, and frustration-free AI experiences.
BS, MS, or PhD in Computer Science, Electrical Engineering, Mathematics, or equivalent practical expertise.
How We Work
We optimize for leverage. That means great internal tooling, fast CI/CD, and code that scales across many customer types.
We believe in deep ownership. Engineers here talk to users, design features, and ship fast.
We value clarity over process. You’ll spend most of your day building, not waiting on decisions.
We move in person. We’re a tight-knit team that moves fast and solves problems together.
What We Offer
Competitive salary + meaningful equity
A real product with real usage and growing revenue
Strong in-person culture, fast feedback loops, and zero bureaucracy
A small team that feels like a founding team
Full health, dental, vision, 401(k), life insurance, and unlimited PTO
Tools budget, coffee budget, whatever-you-need-to-be-great budget
Want to help reimagine how software works for real-world businesses? Let’s talk.