PathNavigate: A Training-Free Pathology Agent with Surprise-Guided Scan and Shared Slide Memory for Whole-Slide Image VQA
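This paper presents PathNavigate, a training-free multimodal framework for whole-slide image visual question answering (WSI-VQA) that combines pretrained vision encoders with large language models to answer clinical questions about whole-slide images. The method decouples navigation from reasoning and locates sparse key regions under a limited inspection budget. It is reported to significantly outperform conventional supervised multimodal large language models, offering an efficient and flexible AI aid for pathology diagnosis.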
arXiv:2605.23559v1 (announce type: new)
Abstract: Whole-slide image visual question answering (WSI-VQA) frames pathology as an extreme-context search problem: to answer a free-form clinical query, a system must first navigate a gigapixel slide under a strict inspection budget to locate sparse, high-resolution evidence. Existing approaches largely fall into two paradigms: i) supervised pathology multimodal large language models (MLLMs) and agents can absorb localization and reasoning into learned modules, but they often couple navigation to task-specific supervision and retraining, limiting their