Until now, VR systems have tracked only the head and hands. This could soon change: the predictive power of artificial intelligence enables realistic full-body tracking, and therefore better avatar embodiment, based solely on sensor data from the headset and controllers.
With hand tracking for Quest, Meta has already demonstrated that AI is a fundamental technology for virtual and augmented reality: a neural network trained on many hours of hand movements enables robust hand tracking even with the Quest headset’s low-resolution cameras, which aren’t specifically optimized for hand tracking.
This works thanks to the predictive power of artificial intelligence: with the prior knowledge acquired during training, a few inputs from the real world are enough to reproduce the hands accurately in the virtual world. Capturing them completely in real time, on top of VR rendering, would require far more computing power.
From hand tracking to body tracking via AI prediction
In a new project, Meta researchers transfer this principle from hand tracking to the whole body: an AI trained on previously collected tracking data produces the most plausible and physically correct simulation of virtual body movements based on real movements. The resulting system, QuestSim, can realistically animate a full-body avatar using only sensor input from the headset and two controllers.
The Meta team trained the QuestSim AI with artificially generated sensor data. To do this, the researchers simulated the movements of the headset and controllers based on eight hours of motion capture clips from 172 people. That way, they didn’t have to record headset and controller data alongside body movements from scratch.
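The idea of deriving sensor data from existing motion capture clips can be illustrated with a minimal sketch. It is not Meta's implementation: the joint names, the `SensorFrame` type and the `synthesize_sensor_frames` function are hypothetical, and the sketch uses positions only, whereas real devices also report orientation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical joint names; a real motion capture skeleton may label them differently.
TRACKED_JOINTS = ("head", "left_hand", "right_hand")


@dataclass
class SensorFrame:
    """Synthetic positions of the three tracked devices at one time step."""
    headset: Vec3
    left_controller: Vec3
    right_controller: Vec3


def synthesize_sensor_frames(clip: List[Dict[str, Vec3]]) -> List[SensorFrame]:
    """Derive headset and controller positions from full-body mocap frames.

    Each frame maps joint names to world-space positions. Only the head and
    hand joints are kept; all other joints (legs, feet, ...) are discarded,
    because the real devices could not observe them.
    """
    frames = []
    for joints in clip:
        head, left, right = (joints[name] for name in TRACKED_JOINTS)
        frames.append(SensorFrame(head, left, right))
    return frames


# Two made-up frames of a full-body clip (positions in metres).
clip = [
    {"head": (0.0, 1.70, 0.0), "left_hand": (-0.3, 1.0, 0.1),
     "right_hand": (0.3, 1.0, 0.1), "left_foot": (-0.1, 0.0, 0.0)},
    {"head": (0.0, 1.69, 0.1), "left_hand": (-0.3, 1.0, 0.2),
     "right_hand": (0.3, 1.1, 0.0), "left_foot": (-0.1, 0.0, 0.2)},
]
sensor_frames = synthesize_sensor_frames(clip)
```

In this sketch the full-body frames remain available as the training target, while the three-device frames stand in for what the headset and controllers would have measured.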
The motion capture clips included 130 minutes of walking, 110 minutes of jogging, 80 minutes of informal conversation with gestures, 90 minutes of whiteboard discussion and 70 minutes of balancing. Training the avatar simulation with reinforcement learning took about two days.
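As a quick consistency check, the category durations listed above add up to the eight hours of motion capture mentioned earlier. The dictionary below simply restates the article's figures; the variable names are chosen for illustration.

```python
# Minutes per motion category, as listed in the article.
clip_minutes = {
    "walking": 130,
    "jogging": 110,
    "informal conversation": 80,
    "whiteboard discussion": 90,
    "balancing": 70,
}

total_minutes = sum(clip_minutes.values())
total_hours = total_minutes / 60

print(f"{total_minutes} minutes = {total_hours:g} hours")  # 480 minutes = 8 hours
```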
After training, QuestSim can recognize the movement a person makes based on real data from the headset and controllers. Using AI prediction, QuestSim can even simulate the movements of body parts such as the legs, for which no real-time sensor data is available but whose movements were part of the motion capture data used for training and were therefore learned by the AI. To keep the movements plausible, the avatar is also subject to the rules of a physics simulator.
The headset alone is enough for a believable full-body avatar
QuestSim works for people of different heights. However, if the avatar’s proportions differ from those of the real person, the animation suffers. For example, a tall avatar driven by a short person walks bent over. The researchers still see room for optimization here.
Meta’s research team also shows that sensor data from the headset alone, combined with AI prediction, is sufficient for a believable and physically correct animated full-body avatar.
AI motion prediction works best for movements that were included in the training data and that show a strong correlation between upper-body and leg movement. For complicated or very dynamic movements such as fast sprints or jumps, the avatar may slide or fall. Also, since the avatar is physics-based, it does not support teleportation.
In further work, Meta researchers want to incorporate more detailed skeletal and body shape information into the training to improve the avatars’ movement variety.