Recursive Block-Diagonal Coupling for Resource-Efficient Training of Vision Models
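Training high-capacity vision models from scratch requires substantial computational resources. Existing growth methods often assume that a narrower model is already available, which hides the cost of the overall pipeline. This paper proposes the RBDC protocol, which builds a wide model by recursively coupling independently trained narrow models in a parameter-free, block-diagonal way. The training budget can be allocated flexibly, which significantly reduces total compute and improves training efficiency.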
arXiv:2605.23656v1 Announce Type: new Abstract: Training high-capacity vision models from scratch requires substantial computational resources. To improve the training efficiency of a wide target model, existing growth methods often assume that narrower models are already available, which obscures the true computational cost of the entire pipeline. We propose an efficient training protocol, RBDC, that builds wide models by recursively coupling narrower, independently trained models in a parameter-free, block-diagonal way. This allows a flexible allocation of the training budget available across all the
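
As a rough illustration of what "parameter-free block-diagonal coupling" can mean for a single linear layer, the sketch below places the weight matrices of several narrow layers on the diagonal of one wider matrix and fills the off-diagonal blocks with zeros, so no new trained parameters are introduced. This is an assumption-laden sketch, not the paper's implementation: the function names `block_diag_couple` and `couple_recursively` are hypothetical, the pairwise recursion order is a guess, and the abstract does not say how inputs are routed to each block or how training is interleaved with coupling.

```python
import numpy as np


def block_diag_couple(w_a, w_b):
    """Couple two weight matrices into one block-diagonal matrix.

    Hypothetical helper: off-diagonal blocks are zeros, so the wide
    matrix holds exactly the entries of w_a and w_b and nothing else.
    """
    out_a, in_a = w_a.shape
    out_b, in_b = w_b.shape
    wide = np.zeros((out_a + out_b, in_a + in_b),
                    dtype=np.result_type(w_a, w_b))
    wide[:out_a, :in_a] = w_a   # top-left block
    wide[out_a:, in_a:] = w_b   # bottom-right block
    return wide


def couple_recursively(weights):
    """Pairwise-couple a list of matrices until a single wide one remains.

    Assumed recursion scheme: neighbours are merged level by level; an
    unpaired last matrix is carried over to the next level unchanged.
    """
    level = list(weights)
    while len(level) > 1:
        merged = [block_diag_couple(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            merged.append(level[-1])
        level = merged
    return level[0]


# Usage: four narrow 3 -> 2 layers become one wide 12 -> 8 layer.
rng = np.random.default_rng(0)
narrow_weights = [rng.standard_normal((2, 3)) for _ in range(4)]
narrow_inputs = [rng.standard_normal(3) for _ in range(4)]

wide_weight = couple_recursively(narrow_weights)
wide_output = wide_weight @ np.concatenate(narrow_inputs)
narrow_outputs = np.concatenate(
    [w @ x for w, x in zip(narrow_weights, narrow_inputs)])
```

Under these assumptions, the wide layer applied to the concatenated inputs reproduces the concatenated outputs of the narrow layers, which is why the coupling step itself needs no extra training. Any further training of the wide model, and how the budget is split between the narrow and wide stages, is outside what this sketch covers.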