
My primary research interests are machine learning (ML) and computer vision (CV).

The key technology areas that motivate my work include:

  • AI for health and wellness 

  • AI systems that enhance human perception & reduce cognitive load

  • AI for individual & ecological wellbeing

Over the past decade, my technical contributions have spanned several AI domains:

  • Multimodal ML for conversational & social understanding 

  • Egocentric (first-person) computer vision

  • Human-in-the-loop ML systems

  • Personalization of audio-visual content 

  • On-device multimodal machine learning (specifically for wearables)

  • Superhuman hearing & wearable-based hearing aids

  • Enhancing the perceptual realism of audio in virtual environments

  • Assistants for wearables personalization 

@Meta

At Meta Reality Labs Research, my focus is on demonstrating the wearables form factor (smart glasses and AR in particular) as a medium to improve human health and wellbeing, especially in conversational and social scenarios.

  • How can we use wearables to enhance human hearing capacity (e.g., enabling conversations in high-noise settings like restaurants and social events, and enhancing hearing for hearing-impaired individuals)?

  • How can we make virtual conversations perceptually realistic, and thereby increase social connection and presence?

  • How can we reduce auditory cognitive load in daily settings with wearables and Meta AI?

  • What general-purpose multimodal AI systems (camera + microphone + gyroscope) are needed to discover and drive audio-driven use cases on the wearables form factor?

My team's approach to answering these questions has been to innovate on all-day egocentric AI grounded in physics- and domain-driven perceptual and cognitive knowledge, along with end-to-end system-level AI optimization.

My broader org at Reality Labs Research discovers, designs, develops, and builds egocentric multi-sensory systems for audio and conversational experiences by bringing together always-on AI/DSP systems, wearable device physics, and human auditory perception.

Some references to the work done by me and my team over the past decade:

Conversation Focus on smart glasses
-- Connect 2025 https://www.youtube.com/live/D97ILdUbYww?si=7qQE0B9k_5YphDQz&t=2292
-- Connect dev keynote https://www.youtube.com/watch?v=KWdUxc24dIw&t=4930s

Inside Facebook Reality Labs Audio: The future of Audio
-- https://about.fb.com/news/2020/09/facebook-reality-labs-research-future-of-audio/

Ego4D reveal session at the EPIC Workshop, ICCV 2021
-- https://youtu.be/2dau0W0NVQY?si=mWhfdjzpGcwcKfHJ&t=3302

CVPR 2023 Sight & Sound workshop talk
-- https://www.youtube.com/live/6TaZT2u1jJ8?si=_O8tCl2y_2ppOCfi&t=32716

Open-Source Datasets

My group has also played a vital role in building large-scale, open-sourceable datasets for egocentric machine perception research. Check out the following two datasets and benchmarks we have designed over the past few years.

Ego4D: Around the World in 3,000 Hours of Egocentric Video

EasyCom: An Augmented Reality Dataset to Support Algorithms for Easy Communication in Noisy Environments

 

@grad-school

Prior to moving to Meta Reality Labs, I worked primarily in the following two application domains, building novel machine learning and computer vision models and systems.

  • Brain imaging and clinical trial design

  • Multimodal predictive modeling of Alzheimer's disease

@ Oct 2020
