Abstract:
Recently, there has been a trend toward developing deeper and wider Convolutional Neural Networks (CNNs) to improve task accuracy. As a result, GPU memory quickly becomes the performance bottleneck, since its capacity cannot keep up with the growing memory requirements of CNN models. Existing solutions exploit techniques such as swapping and recomputation to mitigate the memory shortage. However, they suffer from performance degradation due to either the limited CPU-GPU bandwidth or the significant recomputation cost. This paper proposes a compression-based technique called FreeLunch that actively compresses the intermediate data to reduce the memory footprint of training large CNN models. Based on our evaluation, FreeLunch has up to 35% less memory consumption and up to 70% better throughput than swapping and recomputation.
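The abstract does not detail FreeLunch's compression algorithm, but the general idea it describes — compress each layer's intermediate feature map after forward propagation, then decompress it on demand during backward propagation — can be sketched minimally. The snippet below is an illustration only, assuming lossless zlib compression over NumPy arrays; the helper names are hypothetical and are not FreeLunch's actual API.

```python
import zlib
import numpy as np

def compress_activation(act: np.ndarray):
    """Replace a live feature map with a compressed byte string
    to shrink the memory footprint between forward and backward."""
    packed = zlib.compress(act.tobytes(), level=1)  # fast, lossless
    return packed, act.shape, act.dtype

def decompress_activation(packed: bytes, shape, dtype) -> np.ndarray:
    """Restore the feature map when backward propagation needs it."""
    return np.frombuffer(zlib.decompress(packed), dtype=dtype).reshape(shape)

# Forward pass: keep only the compressed copy of the layer output.
# Feature maps after ReLU are often sparse, so they compress well;
# an all-zero map is the extreme case used here for illustration.
act = np.zeros((64, 128, 28, 28), dtype=np.float32)
packed, shape, dtype = compress_activation(act)

# Backward pass: decompress on demand, bit-exact since zlib is lossless.
restored = decompress_activation(packed, shape, dtype)
assert np.array_equal(act, restored)
```

The trade-off mirrors the one in the abstract: compression spends GPU/CPU cycles instead of PCIe bandwidth (swapping) or repeated forward computation (recomputation), so it pays off when the feature maps are compressible.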
Date of Conference: 14 November 2021
Date Added to IEEE Xplore: 20 December 2021
- IEEE Keywords
- Index Terms
- Convolutional Neural Network
- GPU Memory
- Memory Management
- Convolutional Neural Network Model
- Memory Consumption
- Memory Footprint
- Deep Neural Network
- Batch Size
- Feature Maps
- Deep Convolutional Neural Network
- Intermediate Results
- Convolutional Neural Network Training
- Compression Ratio
- Forward Propagation
- Compression Rate
- Single GPU
- Previous Policy
- Backward Propagation
- Peak Consumption
- Compression Algorithm
- Memory Allocation
- Large Batch Size
- Performance Overhead
- Forward Calculation
- Memory Operations
- Number Of Allocations
- Convolutional Layers
- AlexNet
- High Throughput
- Top-5 Accuracy
- Author Keywords