Runtime Support for Accelerating CNN Models on Digital DRAM Processing-in-Memory Hardware

Abstract:

Processing-in-memory (PIM) offers a promising solution to the main memory bottleneck by placing computational logic in or near memory devices to reduce data movement overheads. Recent work explored how commercial DRAM can feature digital PIM logic while meeting fab-level energy and area constraints, and showed a significant speedup in the inference time of data-intensive deep learning models. However, convolutional neural network (CNN) models were not considered main targets for commercial DRAM-PIM because of their compute-intensive convolution layers. Moreover, recent studies revealed that the area and power constraints on the memory die prevent DRAM-PIM from competing with GPUs and specialized accelerators in accelerating these layers. Recently, mobile CNN models have increasingly adopted a composition of depthwise and pointwise convolutions in place of such compute-intensive convolutions to reduce computation cost without an accuracy drop. In this paper, we show that 1x1 convolutions can be offloaded for PIM acceleration with integrated runtime support and without any hardware or algorithm changes. We provide further speedup through parallel execution on the GPU and DRAM-PIM and through code generation optimizations. Our solution achieves up to 35.2% (31.6% on average) speedup over the GPU across all 1x1 convolutions in mobile CNN models.
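To make the computation-cost claim concrete, the sketch below counts multiply-accumulate operations (MACs) for a standard convolution and for its depthwise-plus-pointwise replacement. It is an illustration only, not taken from the paper: the function names and the layer shape are hypothetical, and it assumes stride 1 with an output feature map of the given height and width.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """Return (depthwise, pointwise) MACs for a k x k depthwise convolution
    followed by a 1x1 pointwise convolution on the same h x w output map."""
    depthwise = h * w * c_in * k * k      # one k x k filter per input channel
    pointwise = h * w * c_in * c_out      # 1x1 convolution mixes the channels
    return depthwise, pointwise


# Hypothetical layer shape, chosen only for illustration.
h, w, c_in, c_out, k = 14, 14, 256, 256, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
dw, pw = depthwise_separable_macs(h, w, c_in, c_out, k)

print(f"standard conv MACs:       {std}")
print(f"depthwise + pointwise:    {dw + pw}")
print(f"reduction factor:         {std / (dw + pw):.2f}")
print(f"pointwise share of MACs:  {pw / (dw + pw):.1%}")
```

Under these assumptions, the 1x1 pointwise convolution accounts for most of the remaining MACs in the separable form, which is the layer type the abstract names as the offload target.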

Published in: IEEE Computer Architecture Letters (Volume: 21, Issue: 2, 01 July-Dec. 2022)

Page(s): 33 - 36

Date of Publication: 13 June 2022

DOI: 10.1109/LCA.2022.3182363

1 Introduction

The explosive scaling of computing power in recent decades has put a spotlight on main memory as an increasingly critical performance and energy bottleneck. Processing-in-Memory (PIM) offers a promising solution to this issue by placing computational logic in or near memory devices, thus reducing or eliminating data movement overheads [1], [2]. While the idea is not new [3], [4], recent technology advances in in-memory architectures have led to its potential as a viable solution being actively explored in varying applications and commercial hardware contexts [5], [6], [7].
