On the Provable Importance of Gradients for Language-Assisted Image Clustering
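This paper studies Language-assisted Image Clustering (LaIC), which uses textual semantics to make visual representations more discriminative. The core challenge is that the true class names are unknown, so positive nouns, i.e., those semantically close to the images, must be filtered from an unlabeled corpus. Existing filtering strategies rely mainly on CLIP's off-the-shelf feature space.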
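To make the idea of filtering nouns in a shared feature space concrete, here is a minimal sketch of similarity-based filtering. It is an illustration under stated assumptions, not the paper's method or any specific prior strategy: the function name `filter_positive_nouns`, the max-over-images scoring rule, and the `top_k` cutoff are all hypothetical, and the inputs are assumed to be precomputed image and noun embeddings (for example from CLIP's image and text encoders).

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def filter_positive_nouns(image_feats, noun_feats, top_k):
    """Rank candidate nouns by their similarity to a set of images.

    Assumption (hypothetical scoring rule): a noun's score is its highest
    cosine similarity to any image. Returns the indices of the top_k nouns
    and the score of every noun.
    """
    img = l2_normalize(np.asarray(image_feats, dtype=np.float64))
    txt = l2_normalize(np.asarray(noun_feats, dtype=np.float64))
    sim = txt @ img.T                 # shape: (num_nouns, num_images)
    scores = sim.max(axis=1)          # best-matching image per noun
    order = np.argsort(-scores, kind="stable")  # descending by score
    return order[:top_k], scores
```

Calling `filter_positive_nouns(image_feats, noun_feats, top_k=2)` returns the indices of the two highest-scoring nouns together with all scores. Swapping the max for a mean, or the fixed `top_k` for a threshold, are equally plausible choices in a sketch like this.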
arXiv:2510.16335v4 Announce Type: replace Abstract: This paper investigates the recently emerged problem of Language-assisted Image Clustering (LaIC), where textual semantics are leveraged to improve the discriminability of visual representations and thereby facilitate image clustering. Because the true class names are unavailable, one of the core challenges of LaIC is how to filter positive nouns, i.e., those semantically close to the images of interest, from unlabeled in-the-wild corpus data. Existing filtering strategies are predominantly based on the off-the-shelf feature space learned by CLIP; however,