EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation
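Evaluation has become a bottleneck for video generation models: most benchmarks check only "whether it is right" (basic prompt following) and neglect "whether it is good" (cinematic quality, acting, and aesthetics). As the field turns to reinforcement learning and agentic workflows to reach professional-grade cinematic synthesis, the paper argues that evaluation must be able to judge quality at that level. Methodological highlight: evaluation should cover aesthetic and acting dimensions more comprehensively. Practical significance: it pushes the assessment of generated video from "right or wrong" toward "good or bad".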
arXiv:2605.23271v1 Announce Type: new Abstract: The rapid evolution of generative video foundation models has propelled the field toward professional-grade cinematic synthesis. To achieve such demanding quality, the community is transitioning toward Reinforcement Learning (RL) and agentic workflows. However, reliable evaluation has emerged as a critical bottleneck. Existing benchmarks predominantly evaluate "whether it is right" (basic prompt-following) while fundamentally neglecting "whether it is good" (cinematic quality, acting, and aesthetics). Furthermore, current automated metrics lack t