I. Introduction
In The last few years, we have witnessed a flourish of convolutional neural networks (CNNs) in computer vision research and applications. To improve the performance of CNNs, recent works add more convolutional layers to the CNN architectures. For instance, from 8-layer AlexNet [1] to 1000-layer ResNet [2], [3], in order to achieve higher accuracy for image recognition. However, more learnable layers introduce more parameters and prolong inference time.