A General Purpose Near Data Processing Architecture Optimized for Data-intensive Applications

University essay from Lunds universitet/Institutionen för elektro- och informationsteknik

Authors: Xingda Li; Haidi Hu [2023]

Keywords: Technology and Engineering

Abstract: In recent years, as Internet of Things (IoT) and machine learning technologies have advanced, interest in energy-efficient and flexible architectures for embedded systems has grown. To bridge the performance gap between microprocessors and memory systems, Near-Data Processing (NDP) was introduced. Although several works have implemented NDP, few of them exploit the microprocessor's cache memory. In this thesis, we present an NDP architecture integrated with static random access memory (SRAM) that serves as the L2 cache of a microcontroller unit (MCU). The proposed NDP architecture is tailored for data-intensive applications and seeks to address several challenges. A coarse-grained reconfigurable array (CGRA)-based strategy is used to maximize flexibility while reducing power consumption. In addition, techniques such as convolution-and-pooling-integrated computation and two-level clock gating are applied to further improve energy efficiency. The design was implemented in STMicroelectronics (STM) 65 nm Low Power Low VT (LPLVT) technology with a maximum clock rate of 167 MHz. Two popular algorithms, the convolutional neural network (CNN) and K-means, were mapped onto the hardware to evaluate it. As a result, the power efficiency of the CNN and K-means algorithms is improved by 12x and 26x relative to field-programmable gate array (FPGA) and MCU implementations, respectively, and by several orders of magnitude relative to other K-means accelerators.
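
As a rough illustration of the convolution-and-pooling-integrated computation mentioned in the abstract, the C sketch below fuses a 3x3 valid convolution with 2x2 max pooling in a single loop nest, so each pooling window consumes its four convolution results as soon as they are produced and no intermediate convolution feature map is stored. The feature-map sizes, kernel, and function names are illustrative assumptions for a software model only, not the thesis's actual CGRA hardware mapping.

    /* Conceptual software model of convolution-and-pooling-integrated
     * computation. All sizes and names are illustrative assumptions. */
    #include <stdio.h>
    #include <limits.h>

    #define IN   6            /* input feature-map width/height (assumed) */
    #define K    3            /* convolution kernel size (assumed)        */
    #define CONV (IN - K + 1) /* valid-convolution output size = 4        */
    #define OUT  (CONV / 2)   /* output size after 2x2 max pooling = 2    */

    /* One convolution result at output position (r, c). */
    static int conv_at(const int in[IN][IN], const int k[K][K], int r, int c)
    {
        int acc = 0;
        for (int i = 0; i < K; i++)
            for (int j = 0; j < K; j++)
                acc += in[r + i][c + j] * k[i][j];
        return acc;
    }

    /* Fused convolution + 2x2 max pooling: pooling reduces each
     * convolution value immediately instead of storing a full
     * intermediate feature map. */
    static void conv_pool_fused(const int in[IN][IN], const int k[K][K],
                                int out[OUT][OUT])
    {
        for (int pr = 0; pr < OUT; pr++) {
            for (int pc = 0; pc < OUT; pc++) {
                int best = INT_MIN;
                for (int dr = 0; dr < 2; dr++)
                    for (int dc = 0; dc < 2; dc++) {
                        int v = conv_at(in, k, 2 * pr + dr, 2 * pc + dc);
                        if (v > best)
                            best = v;
                    }
                out[pr][pc] = best;
            }
        }
    }

    int main(void)
    {
        int in[IN][IN], out[OUT][OUT];
        const int k[K][K] = { { 1, 0, -1 }, { 1, 0, -1 }, { 1, 0, -1 } };

        /* Simple ramp test pattern. */
        for (int i = 0; i < IN; i++)
            for (int j = 0; j < IN; j++)
                in[i][j] = i + j;

        conv_pool_fused(in, k, out);

        for (int i = 0; i < OUT; i++) {
            for (int j = 0; j < OUT; j++)
                printf("%4d ", out[i][j]);
            printf("\n");
        }
        return 0;
    }

In hardware, the same fusion saves the writes and reads of the intermediate convolution map, which is one way such a scheme can reduce memory traffic and energy; the exact mechanism used in the thesis is described in the full text.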
