Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors
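This work proposes Imagine2Real, a zero-shot humanoid-object interaction framework. It targets two problems in existing video-prior approaches to the scarcity of 3D data: representation misalignment and retargeting complexity. Key finding: flexible, geometry-free interaction is achievable without geometric priors (such as CAD models) or intensive morphing. Core of the method: it builds on video generative priors and bypasses explicit 3D models. Practical significance: it reduces dependence on high-fidelity data and improves the adaptability and efficiency of robot interaction.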
arXiv:2605.22272v2 Announce Type: replace-cross Abstract: Whole-body Humanoid-Object Interaction (HOI) is bottlenecked by the scarcity of high-fidelity 3D data. While video generative priors offer a promising alternative, existing methods suffer from Representation Misalignment due to their reliance on geometric priors (e.g., explicit CAD models), and Retargeting Complexity arising from intensive morphing and morphological mismatch. We propose Imagine2Real, a zero-shot HOI framework for flexible, geometry-free interaction. To resolve misalignment, we formulate robot and objec