LangFlash: Feed-forward 3D Language Gaussian Splatting from Sparse Unposed Images
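This paper presents LangFlash, a feed-forward framework for 3D Language Gaussian Splatting that predicts geometry and semantics directly from sparse, unposed multi-view images. It requires no iterative optimization, which enables low-latency 3D reconstruction and language-consistent scene understanding. To support large-scale training, the authors enriched the RealEstate10k dataset with coherent and dense semantic information.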
arXiv:2605.23287v1. Abstract: We present LangFlash, a feed-forward framework for 3D Language Gaussian Splatting that reconstructs 3D scenes from sparse unposed multi-view images, parameterizing each scene with Gaussian primitives enriched with language-aligned semantic features. Unlike optimization-based 3D methods, LangFlash directly predicts the geometry and semantics in a single forward pass, enabling low-latency 3D reconstruction and language-consistent scene understanding. To support large-scale training, we enriched the RealEstate10k dataset with coherent and dense semantic information...