VideoOdyssey: A Benchmark for Ultra-Long-Context and Omni-Modal Video Understanding
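Existing long video understanding benchmarks evaluate comprehension of short video segments only, so they cannot measure the challenge of ultra-long-context reasoning. This paper targets the core bottleneck of real-world long video understanding, namely continuous tracking, information integration, and memory retention, and proposes a new evaluation framework.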
arXiv:2605.22907v1 Announce Type: new Abstract: Real-world long video understanding requires models to perform continuous tracking, information integration and memory retention over massive temporal spans within extreme video durations. Mastering this intense cognitive load constitutes the fundamental bottleneck in long video understanding. While existing benchmarks have driven progress by scaling up video duration, their evaluation tasks often require comprehending only short and isolated video segments, falling short of capturing the challenge of ultra-long-context reasoning. To measure this