Geo-Align: Video Generation Alignment via Metric Geometry Reward
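Existing camera-controlled video generation methods rely on fine-tuning with synthetic datasets, because synchronized multi-view real-world video data is scarce. As a result, they generalize poorly and struggle to follow physical scale and camera trajectories accurately. This paper proposes the Geo- method to bridge this gap and improve generality and accuracy on real-world videos.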
arXiv:2605.23903v1 Announce Type: new Abstract: Camera-controlled video generation has achieved remarkable progress in recent years. However, existing video-to-video re-rendering methods primarily rely on Supervised Fine-Tuning using synthetic datasets. At present, there is an extreme scarcity of synchronized, multi-view real-world video data. Consequently, the prevailing paradigm often exhibits limited generalization when processing out-of-distribution real-world videos, with models struggling to accurately adhere to physical scales and camera trajectories. To bridge this gap, we propose Geo-