Revitalizing Dense Material Segmentation: Stabilized Vision Transformers and the Generalization Paradox
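This paper revives the Apple Dense Material Segmentation (DMS) benchmark and establishes a modern baseline built on the Vision Transformer (ViT). Through exhaustive experiments, it addresses the benchmark stagnation previously attributed to geometry-biased foundation models and improves accuracy on material segmentation, the pixel-wise classification of physical surface properties. The approach strengthens the understanding of physicochemical characteristics such as tactile feel and material type, which sets it apart from traditional object parsing, and it has the potential to advance practical applications such as robotic manipulation and intelligent material recognition.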
arXiv:2605.23747v1 Announce Type: new Abstract: Material segmentation, the pixel-wise classification of physical surface properties, remains a challenging problem in computer vision, requiring physicochemical understanding distinct from object-centric parsing. Despite the introduction of the rigorous Apple Dense Material Segmentation (DMS) dataset, the benchmark has suffered from attrition and stagnation, increasingly overshadowed by geometry-biased foundation models. In this paper, we revive the Apple-DMS benchmark to establish a modern Vision Transformer baseline. We conduct an exhaustive ev