LaMo: Self-Supervised Latent Motion Priors for Physical Realism in Video Generation
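Modern video generators lack physical and motion consistency. LaMo proposes a self-supervised approach that extracts motion cues from the unlabeled videos used for training and builds a motion prior over frame-to-frame latent changes, conditioned on the current latent representation and the text prompt, without relying on external simulators or physics datasets.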
arXiv:2605.23878v1
Abstract: Modern video generators produce visually compelling clips but still struggle with physical and motion consistency, limiting their use as reliable world simulators. Existing remedies often rely on external simulators, teacher models, or curated physics-focused data. We explore a complementary self-supervised direction: extracting motion cues from the unlabeled videos already used to train video diffusion models. We propose LaMo, which formulates a latent motion prior over frame-to-frame latent changes conditioned on the current latent and prompt.
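To make the phrase "frame-to-frame latent changes conditioned on the current latent" concrete, the following is a minimal sketch of how training pairs for such a prior might be assembled from an encoded clip. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not define the change, so a plain difference between consecutive latents is assumed here, and the function name `latent_motion_pairs` and the `(T, C, H, W)` latent layout are hypothetical. Prompt conditioning and the prior model itself are omitted.

```python
import numpy as np

def latent_motion_pairs(latents):
    """Split a latent clip into (current latent, latent change) pairs.

    Assumption: `latents` has shape (T, C, H, W), one latent per frame,
    and the "change" is the simple difference z[t+1] - z[t]. The paper
    may define it differently.
    """
    latents = np.asarray(latents, dtype=np.float64)
    if latents.ndim != 4 or latents.shape[0] < 2:
        raise ValueError("expected shape (T, C, H, W) with T >= 2")
    current = latents[:-1]               # z[t] for t = 0 .. T-2
    delta = latents[1:] - latents[:-1]   # z[t+1] - z[t]
    return current, delta

# Stand-in for encoder output: 8 frames, 4 channels, 16x16 latent grid.
rng = np.random.default_rng(0)
clip = rng.standard_normal((8, 4, 16, 16))
current, delta = latent_motion_pairs(clip)
```

Under this assumption, a motion prior would be a model that predicts (or scores) `delta[t]` given `current[t]` and the prompt embedding; adding a predicted change back to the current latent recovers the next frame's latent.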