How Far Are We from Generating Missing Modalities with Foundation Models?
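The potential of multimodal foundation models for missing modality reconstruction remains underexplored. This study identifies and formalizes three reconstruction paradigms and evaluates 42 model variants on reconstruction accuracy and adaptability to downstream tasks. The results indicate that current foundation models generally perform poorly on this task and need further improvement.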
arXiv:2506.03530v3 Announce Type: replace-cross Abstract: Multimodal foundation models have demonstrated impressive capabilities across diverse tasks. However, their potential as plug-and-play solutions for missing modality reconstruction remains underexplored. To bridge this gap, we identify and formalize three potential paradigms for missing modality reconstruction, and perform a comprehensive evaluation across these paradigms, covering 42 model variants in terms of reconstruction accuracy and adaptability to downstream tasks. Our analysis reveals that current foundation models often fall sh