TACO

Dataset (active)

TACO is a large-scale bimanual hand-object-interaction dataset introduced at CVPR 2024, designed to address the lack of multi-object manipulation data in existing hand-object interaction research. Unlike prior datasets that focus on single-hand, single-object interactions, TACO captures the complexity of real-world bimanual manipulation across diverse tool-action-object compositions.

The dataset comprises 2,500 motion sequences recorded from third-person and egocentric viewpoints simultaneously, using a fully automatic data acquisition pipeline that combines multi-view sensing with optical motion capture. Each sequence includes precise hand-object 3D meshes, action labels, and object affordance annotations, enabling fine-grained analysis of manipulation strategies and object functional regularities.

TACO covers a wide variety of daily activities involving multiple objects, such as cutting with scissors, pouring with a kettle, hammering nails, and assembling objects. Each interaction is annotated with tool-action-object triplets, capturing both the semantic and geometric relationships between hands and objects. The dataset spans diverse object shapes, sizes, and functionalities, providing rich training data for models that need to generalize manipulation skills to novel objects.

A key contribution of TACO is its focus on understanding object functional regularities: how the physical properties and affordances of objects determine the appropriate manipulation strategy. This makes it valuable for research in dexterous manipulation, affordance learning, and skill transfer from human to robot domains. The multi-view setup also supports research in 3D hand reconstruction, object pose estimation, and human-object interaction understanding.

TACO has implications for embodied AI research as a bridge between human manipulation data and robot learning. The detailed motion and contact annotations can inform robot policy learning through imitation, especially for bimanual and contact-rich manipulation tasks that remain challenging for current robot systems.
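
As a rough illustration of the tool-action-object triplet annotation described above, the sketch below models a triplet as a small record, groups sequences by triplet, and holds out a tool to test generalization to unseen objects. All class names, field names, sequence ids, and sample records here are hypothetical and chosen for illustration only; they are not the dataset's actual file format or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """A tool-action-object label (field names are illustrative)."""
    tool: str
    action: str
    target: str


@dataclass
class Sequence:
    """One motion sequence; only the fields needed for this sketch."""
    seq_id: str
    triplet: Triplet


def group_by_triplet(sequences):
    """Map each triplet to the ids of the sequences labeled with it."""
    groups = defaultdict(list)
    for seq in sequences:
        groups[seq.triplet].append(seq.seq_id)
    return dict(groups)


def split_by_held_out_tools(sequences, held_out_tools):
    """Split so that held-out tools appear only in the test set."""
    train = [s for s in sequences if s.triplet.tool not in held_out_tools]
    test = [s for s in sequences if s.triplet.tool in held_out_tools]
    return train, test


# Made-up records for illustration; not taken from the dataset.
demo = [
    Sequence("seq_0001", Triplet("scissors", "cut", "paper")),
    Sequence("seq_0002", Triplet("kettle", "pour", "cup")),
    Sequence("seq_0003", Triplet("scissors", "cut", "paper")),
]
groups = group_by_triplet(demo)
train, test = split_by_held_out_tools(demo, {"kettle"})
```

Splitting on the tool field rather than on sequence ids is one way to keep a tool entirely out of training, which is the kind of setup needed to measure transfer to novel objects.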

Details

Updated: 6/20/2026
Sample count: 2500
Modality: RGB video, 3D meshes, depth, action labels
License: research

Tags

bimanual manipulation, hand-object interaction, dexterous manipulation, human motion, contact-rich manipulation, multi-view capture, 3D meshes, CVPR 2024

Sources

Website: https://taco2024.github.io
Paper: https://arxiv.org/abs/2401.08399
