Geo-Align: Video Generation Alignment via Metric Geometry Reward

arXiv cs.CV, May 25, 2026

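Existing camera-controlled video generation methods rely on fine-tuning with synthetic datasets, and synchronized multi-view real-world video data is scarce. As a result, they generalize poorly and struggle to follow physical scale and camera trajectories accurately. This paper proposes the Geo- method to bridge this gap, improving generality and accuracy on real-world videos.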

arXiv:2605.23903v1 (announce type: new)

Abstract: Camera-controlled video generation has achieved remarkable progress in recent years. However, existing video-to-video re-rendering methods primarily rely on Supervised Fine-Tuning using synthetic datasets. At present, there is an extreme scarcity of synchronized, multi-view real-world video data. Consequently, the prevailing paradigm often exhibits limited generalization when processing out-of-distribution real-world videos, with models struggling to accurately adhere to physical scales and camera trajectories. To bridge this gap, we propose Geo-