Abstract:
Convolutional Neural Networks (CNNs) have become highly accurate and are extensively used for image identification. However, as deep learning applications proliferate, implementing CNNs in hardware remains a substantial challenge, so optimizing hardware design for efficient CNN acceleration is crucial. A vital element of a CNN accelerator architecture is the processing element (PE), which carries out the convolution operation. This paper proposes a modified PE design that aims to reduce hardware utilization and power consumption. To accomplish this, bulky MAC units and conventional adder trees are replaced with a Modified Booth Encoding (MBE) multiplier and Wallace tree (WT) based adders, respectively. Additionally, the proposed multiplier architecture uses XOR-MUX full adders instead of conventional full adders. The XOR-MUX full adder design requires more logic than the conventional full adder, but it is more area- and power-efficient. The proposed design was implemented in Verilog HDL, synthesized on a Xilinx Zynq FPGA, and compared with conventional approaches in terms of power consumption, area, and delay. The results show that the proposed design outperforms the conventional design, with reductions in area, delay, and power consumption.
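The three building blocks named in the abstract can be illustrated with a small behavioral model. The sketch below is not the paper's Verilog design; it is a minimal Python illustration, assuming a textbook radix-4 Booth recoding, a common XOR-MUX full adder formulation (sum = a XOR b XOR cin, carry selected by a multiplexer), and a simple 3:2 carry-save reduction in the spirit of a Wallace tree. All function names and the 8-bit operand width are hypothetical choices made for the example.

```python
def xor_mux_full_adder(a, b, cin):
    """One-bit full adder in XOR-MUX form (assumed textbook formulation)."""
    p = a ^ b                  # propagate signal from the first XOR
    s = p ^ cin                # sum bit from the second XOR
    cout = cin if p else a     # 2:1 mux: pick cin when a != b, else a
    return s, cout


def booth_radix4_digits(y, bits=8):
    """Recode a signed multiplier into radix-4 Booth digits in {-2,...,2}."""
    y_u = y & ((1 << bits) - 1)        # two's-complement bit pattern
    digits, prev = [], 0               # prev is the bit right of the group
    for i in range(0, bits, 2):
        b0 = (y_u >> i) & 1
        b1 = (y_u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def carry_save_reduce(rows):
    """Reduce partial products with word-wide 3:2 compressors, then add."""
    rows = list(rows)
    while len(rows) > 2:
        a, b, c = rows.pop(), rows.pop(), rows.pop()
        p = a ^ b
        s = p ^ c                            # bitwise sum word
        carry = ((p & c) | (~p & a)) << 1    # mux-style carry, shifted left
        rows += [s, carry]
    return sum(rows)                         # final carry-propagate add


def booth_multiply(x, y, bits=8):
    """Signed multiply: Booth partial products plus carry-save reduction."""
    digits = booth_radix4_digits(y, bits)
    partials = [(d * x) << (2 * k) for k, d in enumerate(digits)]
    return carry_save_reduce(partials)
```

Radix-4 recoding halves the number of partial products (four instead of eight for an 8-bit multiplier), which is why it pairs naturally with a compressor tree. Python's unbounded integers stand in for sign-extended hardware words here, so the model says nothing about area, delay, or power.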
Date of Conference: 29 February 2024 - 03 March 2024
Date Added to IEEE Xplore: 16 May 2024