VLK

Project · active

VLK addresses perception-based humanoid loco-manipulation by generating synthetic vision-language-kinematics (VLK) data from reconstructed scenes. The pipeline uses 3D Gaussian Splatting to reconstruct metric-scale indoor environments, synthesizes navigation and object-interaction trajectories using privileged scene information, and renders the paired egocentric observations. It produces 48,000 paired trajectories with no human intervention. A VLK policy trained on this data predicts short-horizon whole-body kinematic trajectories, which a whole-body tracker converts to actions on the physical Unitree G1. The policy is evaluated on navigation and single-object transport tasks.
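
The deployment step described above (a policy predicts a short-horizon kinematic trajectory, and a whole-body tracker turns it into robot actions) can be sketched as a simple control loop. This is an illustrative sketch only, not the paper's implementation: `Observation`, `Pose`, `run_receding_horizon`, and the callables `policy`, `tracker`, and `observe` are hypothetical names, and the receding-horizon replanning scheme (execute the first few predicted poses, then replan) is an assumption that the summary does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical whole-body kinematic target, e.g. a vector of joint angles.
Pose = List[float]


@dataclass
class Observation:
    """Hypothetical policy input: one egocentric frame plus a language command."""
    image: object
    instruction: str


def run_receding_horizon(
    policy: Callable[[Observation], Sequence[Pose]],
    tracker: Callable[[Pose], None],
    observe: Callable[[], Observation],
    num_replans: int,
    execute_steps: int,
) -> int:
    """Replan `num_replans` times (assumed scheme, not from the source).

    After each plan, only the first `execute_steps` predicted poses are
    handed to the tracker before a fresh observation is taken.
    Returns the total number of poses sent to the tracker.
    """
    sent = 0
    for _ in range(num_replans):
        horizon = policy(observe())            # short-horizon kinematic trajectory
        for pose in list(horizon)[:execute_steps]:
            tracker(pose)                      # tracker converts the pose to actions
            sent += 1
    return sent


# Usage with stand-in components (no real robot or model involved).
log: List[Pose] = []
total = run_receding_horizon(
    policy=lambda obs: [[0.0], [1.0], [2.0]],
    tracker=log.append,
    observe=lambda: Observation(image=None, instruction="carry the box to the table"),
    num_replans=3,
    execute_steps=2,
)
```

Executing only a prefix of each predicted trajectory is one common way to keep a short-horizon predictor responsive to new observations; whether VLK does this, and with what horizon length, is not stated here.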

Details

Updated: 7/1/2026

No structured details available.

Tags

humanoid, loco-manipulation, synthetic data, 3D Gaussian Splatting, Unitree G1, vision-language-kinematics

Relationships

Sources

VLK: Learning Humanoid Loco-Manipulation from Synthetic Interactions in Reconstructed Scenes
academic_paper

Appears In

No related landscapes.