ChainFlow-VLA: Causal Flow Planning with Vision-Language Models

arXiv cs.CV, May 25, 2026

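Current end-to-end autonomous driving systems face a fundamental tension between temporal causal reasoning and global trajectory consistency. Autoregressive models capture interaction dependencies through causal factorization, but their step-wise decoding causes errors to accumulate. Diffusion models optimize trajectories globally but lack causal constraints, which makes them unreliable in interactive and safety-critical scenarios. The study exposes deep-seated shortcomings in both classes of methods.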

arXiv:2605.23270v1. Abstract: Current end-to-end autonomous driving systems are fundamentally limited by a mismatch between temporal causal reasoning and global trajectory consistency. Autoregressive (AR) models capture interaction-aware temporal dependencies via causal factorization, but their step-wise decoding leads to error accumulation and suboptimal global structure. In contrast, diffusion models optimize trajectories globally but lack explicit causal constraints, making them unreliable in interactive and safety-critical scenarios. This dichotomy reveals a deeper issue: