Abstract:
Computational requirements for training state-of-the-art neural network models on vision tasks are increasing, because computationally expensive design factors have proven effective in improving quality. Since research in the image-processing field requires many trials, this trend makes it difficult for researchers in computationally restricted environments to verify their hypotheses. Convolution with a wide receptive field (a large kernel) is one such computationally expensive factor that improves quality. This study aims to accelerate the training of large-kernel convolutions by resizing both the training images and the convolution filters to a smaller scale. Applying this strategy requires careful training design to replace conventional training at the target scale, and we propose four techniques to improve the quality of the trained models. In our experiments, we apply our proposals to train an image classifier modified from RepLKNet-31B on the image-classification task of the CIFAR-10, CIFAR-100, and STL-10 datasets. Our proposed framework trains nearly identical models 4.62–4.91 times faster than standard training at the target spatial scale while maintaining accuracy, and provides 2.61–2.79 times further training acceleration and more stable accuracy compared with Progressive Learning. In addition to the training acceleration, our framework can simultaneously train models for multiple scales without any scale-specific tuning, which enables scalable usage with respect to computational cost.
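The core idea stated above is to resize both the training images and the convolution filters to a smaller scale. The PyTorch sketch below is a minimal illustration of that idea, assuming a depthwise large-kernel convolution such as those in RepLKNet-31B blocks; the helper `shrink_conv`, the bilinear resampling of filter weights, and the 0.5 scale factor are illustrative assumptions, not the paper's exact method (which additionally relies on the four proposed quality-preserving techniques).

```python
# Illustrative sketch only (not the authors' code): train a large-kernel
# convolution at a reduced spatial scale by shrinking both the input images
# and the convolution filters by the same factor.
import torch
import torch.nn as nn
import torch.nn.functional as F

def shrink_conv(conv: nn.Conv2d, scale_factor: float) -> nn.Conv2d:
    """Return a copy of a (depthwise) conv whose kernel is resized by `scale_factor`."""
    k = max(1, int(conv.kernel_size[0] * scale_factor))
    if k % 2 == 0:
        k += 1  # keep the kernel odd so `padding = k // 2` preserves spatial size
    small = nn.Conv2d(conv.in_channels, conv.out_channels, kernel_size=k,
                      padding=k // 2, groups=conv.groups, bias=conv.bias is not None)
    with torch.no_grad():
        # Resample the (out_ch, in_ch/groups, K, K) weight tensor to the smaller kernel.
        small.weight.copy_(F.interpolate(conv.weight, size=(k, k),
                                         mode="bilinear", align_corners=False))
        if conv.bias is not None:
            small.bias.copy_(conv.bias)
    return small

# Example: a 31x31 depthwise convolution (as in RepLKNet-31B) trained at half scale.
large = nn.Conv2d(64, 64, kernel_size=31, padding=15, groups=64)
small = shrink_conv(large, scale_factor=0.5)  # ~15x15 kernel
images = torch.randn(8, 64, 224, 224)
small_images = F.interpolate(images, scale_factor=0.5,
                             mode="bilinear", align_corners=False)
out = small(small_images)  # the training step would proceed on this reduced scale
print(out.shape)           # torch.Size([8, 64, 112, 112])
```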
[Figure] A conceptual figure of our main idea. Standard training is performed at the target scale, in which the objects and flows are colored green. Our training framework is ...
Published in: IEEE Access (Volume 12)