VGGT-Segmentor: Geometry-Enhanced Cross-View Segmentation

arXiv cs.CV, May 25, 2026

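This work addresses instance-level object segmentation across egocentric and exocentric views, where changes in scale, viewpoint, and occlusion make direct pixel-level matching unstable. Method highlight: it builds on the geometry-aware model VGGT for feature alignment, but finds that VGGT often fails at dense prediction because of pixel-level projection drift. Practical significance: the study exposes a key limitation of existing models in cross-view segmentation, points toward more robust geometry-semantic fusion methods, and offers guidance for embodied AI and remote collaboration applications.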

arXiv:2604.13596v3 (announce type: replace)

Abstract: Instance-level object segmentation across disparate egocentric and exocentric views is a fundamental challenge in visual understanding, critical for applications in embodied AI and remote collaboration. This task is exceptionally difficult due to severe changes in scale, perspective, and occlusion, which destabilize direct pixel-level matching. While recent geometry-aware models like VGGT provide a strong foundation for feature alignment, we find they often fail at dense prediction tasks due to significant pixel-level projection drift, even w