AI Voice Commands
Recognizing phrases uttered using Meta's Wit.ai
Gesture Recognition
Recognizing gestures using OpenXR
Overview
This project explores multimodal interaction techniques for collaborative Extended Reality environments, focusing on AI voice commands and gesture recognition. The implementation has been tested on the Meta Quest 2 and the Meta Quest Pro.
Multiplayer Implementation
Since we were considering adding both voice chat and voice commands, I explored Unity's Netcode for GameObjects as well as the Photon Fusion multiplayer engine. Because Netcode for GameObjects doesn't support voice chat, I ended up switching to Photon Fusion.
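Since the project's own networking code isn't shown here, the snippet below is only a minimal sketch of how a Fusion session is typically started or joined; the SessionLauncher component and the session name are hypothetical placeholders, not taken from the project.

```csharp
using Fusion;
using UnityEngine;

// Hypothetical bootstrap component: starts or joins a shared Fusion session.
public class SessionLauncher : MonoBehaviour
{
    private async void Start()
    {
        var runner = gameObject.AddComponent<NetworkRunner>();
        runner.ProvideInput = true; // this peer will submit input to the simulation

        // Shared mode lets every peer hold state authority over objects,
        // which suits a collaborative XR room.
        await runner.StartGame(new StartGameArgs
        {
            GameMode = GameMode.Shared,
            SessionName = "xr-collab-room" // placeholder room name
        });
    }
}
```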

For the voice modality, I decided on a voice-command recognition approach using Wit.ai, Meta's natural language processing service, which integrates directly with the Meta SDK in Unity. In Wit.ai, you define an utterance and assign it an intent. In Unity, you add your API key, set up listener events, and fire functions when those events are triggered.
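As a rough sketch of that wiring, assuming the Meta Voice SDK's AppVoiceExperience component (exact namespaces vary across SDK versions), the handler below subscribes to the OnResponse listener event and dispatches on the returned intent; the intent names are hypothetical examples, not the project's actual intents.

```csharp
using UnityEngine;
using Oculus.Voice;    // AppVoiceExperience (Meta Voice SDK; namespace varies by version)
using Meta.WitAi.Json; // WitResponseNode

public class VoiceCommandHandler : MonoBehaviour
{
    // Scene component configured with the Wit.ai API key.
    [SerializeField] private AppVoiceExperience voice;

    private void OnEnable()  => voice.VoiceEvents.OnResponse.AddListener(OnWitResponse);
    private void OnDisable() => voice.VoiceEvents.OnResponse.RemoveListener(OnWitResponse);

    // Fired when Wit.ai returns a parsed NLP response.
    private void OnWitResponse(WitResponseNode response)
    {
        var intentNode = response?["intents"]?[0]?["name"];
        if (intentNode == null) return; // no intent matched the utterance

        switch (intentNode.Value)
        {
            case "spawn_object": // hypothetical intents defined in the Wit.ai console
                Debug.Log("Spawn command recognized");
                break;
            case "change_color":
                Debug.Log("Color command recognized");
                break;
        }
    }

    // Call this (e.g., from a controller button) to start capturing speech.
    public void StartListening() => voice.Activate();
}
```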
Made in collaboration with Divya Mahesh (Product Manager), Revati, and Xiang.
Aarnav Sangekar 2025