reachy-personal-assistant
An AI personal assistant for the Reachy Mini robot, built with NVIDIA's NeMo Agent Toolkit and Nemotron models for interaction and control.
About reachy-personal-assistant
This repository showcases an AI agent, built with the NVIDIA NeMo Agent Toolkit, that controls a Reachy Mini robot using NVIDIA Nemotron models. The agent routes each request to one of three backends: Nemotron Nano text, Nemotron Nano VLM (Vision Language Model), or a ReAct agent for tool-based actions.
For the Non-Technical Reader
Imagine having a personal assistant robot that can understand both your words and what it sees. This project is like giving a Reachy Mini robot a brain that can process information from its camera and microphone. It can answer questions, perform actions, and even show emotions through dance moves. Think of it as a smart helper that responds to your commands and understands its environment, making it useful for tasks like describing what is in front of it or providing information based on visual cues.
For the Technical Reader
The system runs three components in parallel: a Reachy Mini Daemon, a Bot Service, and a NeMo Agent Service. The NeMo Agent Service uses an LLM router to switch dynamically between Nemotron Nano text for text-based interactions, Nemotron Nano VLM for visual understanding, and a ReAct agent for tool-based actions. The Bot Service handles vision processing, speech recognition and text-to-speech, robot movement coordination, and emotional expressions. Setup requires Python 3.10+, the uv package manager, and NVIDIA and ElevenLabs API keys. The robot can run in simulation mode or on actual hardware.
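To make the three-way routing concrete, here is a minimal sketch in Python. It is not the repository's code and does not use the NeMo Agent Toolkit API: the `Route` enum, the `route_query` function, and both keyword lists are hypothetical names invented for illustration, and a simple keyword check stands in for the LLM-based router the project describes.

```python
import re
from enum import Enum


class Route(Enum):
    """The three backends described above (labels are illustrative)."""
    TEXT = "text"      # plain conversation -> text model
    VISION = "vision"  # questions about the camera view -> VLM
    TOOLS = "tools"    # requests for robot actions -> ReAct agent


# Hypothetical keyword lists; the real project uses an LLM to decide.
ACTION_WORDS = frozenset({"dance", "move", "turn", "nod", "wave"})
VISION_WORDS = frozenset({"see", "look", "watch", "holding", "color"})


def route_query(text: str) -> Route:
    """Pick a backend for a user utterance using whole-word keyword matches."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Action requests take priority, since they need tool calls.
    if words & ACTION_WORDS:
        return Route.TOOLS
    if words & VISION_WORDS:
        return Route.VISION
    # Everything else is treated as ordinary text chat.
    return Route.TEXT


if __name__ == "__main__":
    for utterance in (
        "Tell me a joke",
        "What do you see on the desk?",
        "Can you dance for me?",
    ):
        print(f"{utterance!r} -> {route_query(utterance).name}")
```

A keyword heuristic like this fails on ambiguous requests (for example, "look happy" is an action, not a vision question), which is one reason a learned or LLM-based router is the more plausible choice in practice.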
Why It Matters
This project demonstrates the potential of combining robotics and advanced AI models. By leveraging open-source tools and readily available models, it lowers the barrier to entry for creating sophisticated robotic assistants. The use of NVIDIA's Nemotron models highlights the increasing accessibility of powerful AI, while the integration with a physical robot opens up new possibilities for human-robot interaction. This could lead to more affordable and customizable personal robots. #OpenSource #Robotics
The "Voice AI Space Lab" Idea
Imagine building a "Smart Home Choreographer" using this setup. The Reachy Mini Robot could be programmed to identify messes in a room (using its vision capabilities) and then provide voice prompts to family members to clean up specific items, all while playing motivational music and doing a little dance to encourage participation!
The Collaborative CTA
How can we enhance the NeMo Agent's routing capabilities to better handle ambiguous or complex multimodal queries, ensuring more accurate and context-aware responses from the Reachy Mini Robot?