Improved Vision-to-Chart Buoy Association with Learned World-to-Image Projection
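Summary: For the MaCVi 2026 Vision-to-Chart data association challenge, this work makes a lightweight modification to the DETR-based fusion transformer baseline. In the original baseline, the decoder receives buoy queries that encode world-space distance and bearing, so it has to learn the geometric projection implicitly. This work instead trains a dedicated MLP (QueryMLP) that uses chart measurements and IMU orientation to explicitly predict where the buoy's waterline contact point falls in the image, which simplifies the learning task and improves data association accuracy.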
arXiv:2605.22942v1
Abstract: This report presents a lightweight modification to the DETR-based fusion transformer baseline for the MaCVi 2026 Vision-to-Chart data association challenge. The challenge baseline decoder receives per-buoy queries encoding world-space distance and bearing, forcing the transformer to implicitly learn the complex geometric projection from world coordinates to image pixels. Instead, this work trains an additional dedicated MLP, QueryMLP, to explicitly predict the buoy's waterline contact point in the image from chart measurements and IMU orientation
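
To make the world-to-image mapping concrete, the following is a minimal sketch of an idealized pinhole projection from a chart measurement (distance, relative bearing) and IMU orientation (pitch, roll) to a pixel location. It is not the report's method: QueryMLP learns this mapping from data, whereas this sketch hard-codes it. The function name, the camera height, the focal length, the principal point, and the sign conventions (bearing positive to the right, pitch positive when the camera tilts up) are all illustrative assumptions.

```python
import math


def project_waterline_point(distance_m, bearing_rad, pitch_rad=0.0, roll_rad=0.0,
                            cam_height_m=2.0, focal_px=1000.0, cx=960.0, cy=540.0):
    """Project a buoy's waterline point to pixel coordinates (u, v).

    Hypothetical helper: camera height, intrinsics, and sign conventions
    are assumptions chosen for illustration, not values from the report.
    """
    # Level frame aligned with the camera heading: x right, y down, z forward.
    # The waterline lies cam_height_m below the camera, hence y = +height.
    x = distance_m * math.sin(bearing_rad)
    y = cam_height_m
    z = distance_m * math.cos(bearing_rad)

    # Pitch (rotation about the x axis): tilting the camera up moves
    # scene points down in the image, i.e. toward larger v.
    y_p = y * math.cos(pitch_rad) + z * math.sin(pitch_rad)
    z_p = -y * math.sin(pitch_rad) + z * math.cos(pitch_rad)

    # Roll (rotation about the optical axis).
    x_r = x * math.cos(roll_rad) + y_p * math.sin(roll_rad)
    y_r = -x * math.sin(roll_rad) + y_p * math.cos(roll_rad)

    # Points at or behind the image plane cannot be projected.
    if z_p <= 0.0:
        return None

    # Pinhole projection with focal length in pixels.
    u = cx + focal_px * x_r / z_p
    v = cy + focal_px * y_r / z_p
    return u, v
```

Under these assumed values, a buoy dead ahead at 100 m seen by a level camera projects 20 pixels below the principal point. Even this idealized model mixes trigonometric terms with a perspective division, which gives some intuition for why predicting the contact point explicitly can be an easier target than leaving the projection to the decoder.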