NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions

Juze Zhang1,2,3,4*, Haimin Luo1,4*, Hongdi Yang1,4, Xinru Xu1,4, Qianyang Wu1,4, Ye Shi1,4, Jingyi Yu1,4, Lan Xu1,4, Jingya Wang1,4
1ShanghaiTech University, 2Shanghai Advanced Research Institute, Chinese Academy of Sciences, 3University of Chinese Academy of Sciences, 4Shanghai Engineering Research Center of Intelligent Vision and Imaging. *These authors contributed equally. Corresponding author

Abstract

Humans constantly interact with objects in daily life. Capturing such interactions, and subsequently performing visual inference from a fixed viewpoint, suffers from occlusions, shape and texture ambiguities, motion, etc. To mitigate these problems, it is essential to build a training dataset that captures interactions from free viewpoints. We construct a dense multi-view dome to acquire a complex human-object interaction dataset, named HODome, which consists of ∼71M frames of 10 subjects interacting with 23 objects. To process the HODome dataset, we develop NeuralDome, a layer-wise neural processing pipeline tailored to multi-view video inputs that performs accurate tracking, geometry reconstruction, and free-view rendering for both human subjects and objects. Extensive experiments on the HODome dataset demonstrate the effectiveness of NeuralDome on a variety of inference, modeling, and rendering tasks. Both the dataset and the NeuralDome tools will be disseminated to the community for further development.

Video

BibTeX

@inproceedings{zhang2023neuraldome,
  title={NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions},
  author={Juze Zhang and Haimin Luo and Hongdi Yang and Xinru Xu and Qianyang Wu and Ye Shi and Jingyi Yu and Lan Xu and Jingya Wang},
  booktitle={CVPR},
  year={2023}
}