Mitigating Object Hallucinations via Sentence-Level Early Intervention

arXiv cs.CV, May 25, 2026

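Multimodal large language models (MLLMs) often produce hallucinations that contradict the visual input. The study finds that hallucinations mainly arise in the early stages of text generation and then propagate to later output. To address this, it proposes SENTINEL (sentence-level intervention), which suppresses hallucinations at that early stage while avoiding high computational cost and data-distribution mismatch, improving model reliability.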

arXiv:2507.12455v3 Announce Type: replace Abstract: Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations - fabricated content contradicting visual inputs. Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs. We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs. To address this, we propose SENTINEL (Sentence-level