Inconsistency-aware Multimodal Schrödinger Bridge for Deepfake Localization

arXiv cs.CV, May 25, 2026

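IaMSB is a new method for audio-visual deepfake localization. It uses an inconsistency-aware multimodal Schrödinger Bridge to jointly estimate cross-modal consistency and perform interval-level localization. Unlike diffusion models, it requires no explicit noise injection: it produces consistency scores by minimizing path-distribution discrepancy, which suppresses the cross-modal noise propagation that arises under symmetric fusion and improves high-precision localization.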

arXiv:2605.23113v1 Announce Type: new Abstract: Audio-visual deepfake localization demands interval-level outputs that serve as temporal evidence. Despite recent progress, symmetric fusion under single-sided or asynchronous forgeries propagates cross-modal noise, degrading high-precision localization. We present IaMSB, an inconsistency-aware multimodal Schrödinger Bridge (SB) that jointly estimates cross-modal consistency and performs interval-level localization. Unlike diffusion models, SB minimizes path-distribution discrepancy and yields consistency scores without explicit noise injection
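
To make "interval-level outputs" and "high-precision localization" concrete, the following is a minimal sketch of how per-frame forgery scores could be turned into time intervals and compared with a ground-truth interval by temporal IoU. It is not the IaMSB method, which the abstract does not specify at this level. The function names `scores_to_intervals` and `temporal_iou`, the threshold of 0.5, and the frame rate are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def scores_to_intervals(
    scores: List[float], fps: float, threshold: float = 0.5
) -> List[Interval]:
    """Group consecutive frames whose score >= threshold into intervals.

    Hypothetical helper: `scores` stands in for any per-frame forgery
    score; the threshold value is an illustrative assumption.
    """
    intervals: List[Interval] = []
    start = None  # index of the first frame of the currently open interval
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i  # open a new interval
        elif s < threshold and start is not None:
            intervals.append((start / fps, i / fps))  # close it
            start = None
    if start is not None:
        # The sequence ended while an interval was still open.
        intervals.append((start / fps, len(scores) / fps))
    return intervals


def temporal_iou(a: Interval, b: Interval) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


predicted = scores_to_intervals([0.1, 0.9, 0.8, 0.2, 0.7], fps=1.0)
overlap = temporal_iou(predicted[0], (2.0, 4.0))
```

A prediction is usually counted as correct only when its temporal IoU with a ground-truth interval exceeds a chosen cutoff, so a higher cutoff demands tighter boundaries. That is the sense in which noisy fused scores hurt high-precision localization.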