CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
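Large vision-language models suffer from object hallucination, which arises when language priors dominate and visual evidence is insufficient. Existing contrastive decoding methods mitigate this with global perturbations or a per-step negative branch, but they may damage useful visual evidence or add computational overhead. This paper proposes key-token-aware contrastive decoding (KCI-CD), which perturbs only the visual tokens most likely to trigger hallucination, reducing hallucination while preserving generation quality and requiring no additional inference branch.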
arXiv:2605.23344v1. Abstract: Large Vision-Language Models have shown strong multimodal reasoning capabilities, yet they remain susceptible to object hallucinations when language priors dominate insufficient or misaligned visual evidence. Training-free contrastive decoding methods mitigate this issue by comparing predictions from original and perturbed visual inputs, but existing approaches either apply global perturbations that may alter useful visual evidence or invoke an additional negative branch at every decoding step. In this paper, we observe that hallucination risks a
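
The training-free contrastive decoding baseline that the abstract refers to (comparing predictions from original and perturbed visual inputs) can be sketched as follows. This is a minimal illustration of one common formulation of that generic recipe, not the method proposed in this paper. The function names, the contrast weight `alpha`, and the plausibility threshold `beta` are assumptions introduced here for illustration; the logits would come from two forward passes of the same model, one with the original image and one with a perturbed image.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_decode_step(logits_orig, logits_perturbed, alpha=1.0, beta=0.1):
    """Greedy next-token choice under a generic contrastive decoding rule.

    Assumed scoring rule (illustrative, not taken from the paper):
        score = (1 + alpha) * log p_orig - alpha * log p_perturbed
    restricted to tokens whose original probability is at least
    beta times the highest original probability.
    """
    logp_orig = log_softmax(logits_orig)
    logp_pert = log_softmax(logits_perturbed)

    # Plausibility cutoff: discard tokens the original branch finds unlikely,
    # so the contrast cannot promote an implausible token.
    cutoff = max(logp_orig) + math.log(beta)

    best_index, best_score = None, -math.inf
    for i, (lo, lp) in enumerate(zip(logp_orig, logp_pert)):
        if lo < cutoff:
            continue
        # Tokens that stay likely after the image is perturbed are driven
        # by the language prior, so they are penalized.
        score = (1 + alpha) * lo - alpha * lp
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Toy vocabulary of three tokens. Token 0 is equally likely with or without
# clean visual input (prior-driven); token 1 loses support when the image
# is perturbed (visually grounded).
logits_orig = [2.0, 1.8, -3.0]
logits_perturbed = [2.0, 0.5, -3.0]
chosen = contrastive_decode_step(logits_orig, logits_perturbed)
```

In this toy case plain greedy decoding on `logits_orig` would pick token 0, while the contrastive score favors token 1 because its probability depends on the unperturbed image. Note that this sketch runs a second (perturbed) branch at every step, which is the overhead the abstract attributes to existing approaches.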