Low-power Implementation of Neural Network Extension for RISC-V CPU

University essay from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Dario Lo Presti Costantino; [2023]

Keywords: Artificial intelligence; Deep learning; Neural networks; Edge computing; Convolutional neural networks; Low-power electronics; RISC-V; AI accelerators; Parallel processing

Abstract: Deep Learning and Neural Networks have been studied and developed for many years, but the field still needs a great deal of research because industry needs are changing rapidly. The latest challenge is edge inference: the deployment of Deep Learning on small, simple and cheap devices such as low-power microcontrollers. At the same time, in hardware design the industry is moving towards the RISC-V architecture, which is open-source and is developing at such a fast rate that it will soon become the standard. The target device of this thesis is a batteryless, ultra-low-power microcontroller based on energy harvesting and the RISC-V architecture. The challenge at the core of this project is to make a simple Neural Network run on this chip, i.e., to find out the chip's capabilities and limits for such an application and to optimize power and energy consumption as much as possible. To do that, TensorFlow Lite Micro was chosen as the reference Deep Learning framework, and a simple existing application was studied and tested first on the SparkFun Edge board and then successfully ported to the RISC-V ONiO.zero core, with its restrictive features. The optimizations target only the convolutional layer of the neural network, both in software, by implementing the Im2col algorithm, and in hardware, by designing and implementing a new RISC-V instruction and the corresponding hardware unit that performs four 8-bit multiply-and-accumulate operations in parallel. This new design drastically reduces both the inference time (by a factor of 3.7) and the number of instructions executed (by a factor of 4.8), which lowers overall power consumption.
This kind of application on this type of chip can open the door to a whole new market, making it possible to have thousands of small, cheap and self-sufficient chips running Deep Learning applications to solve simple everyday problems, even without a network connection and without any privacy issues.
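
The two optimizations named in the abstract can be illustrated with a small sketch. The abstract does not describe the encoding or exact semantics of the new instruction, nor the data layout used in the thesis, so everything below is an assumption made for illustration: a single-channel input, stride 1, no padding, and hypothetical function names (mac4_i8, pack4_i8, im2col_i8, conv_im2col_mac4). Im2col copies every kernel-sized patch of the input into one row, so the convolution becomes a series of dot products; mac4_i8 is a plain-C model of a unit that multiplies four signed 8-bit lanes packed in a 32-bit word and adds the products to a 32-bit accumulator.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of a HYPOTHETICAL packed MAC instruction (assumption, not
 * the thesis design): a and b each hold four signed 8-bit lanes; the four
 * lane products are added to the 32-bit accumulator. */
static int32_t mac4_i8(int32_t acc, uint32_t a, uint32_t b)
{
    for (int lane = 0; lane < 4; ++lane) {
        int32_t ai = (int32_t)((a >> (8 * lane)) & 0xFFu);
        int32_t bi = (int32_t)((b >> (8 * lane)) & 0xFFu);
        if (ai >= 128) ai -= 256;   /* sign-extend lane of a */
        if (bi >= 128) bi -= 256;   /* sign-extend lane of b */
        acc += ai * bi;
    }
    return acc;
}

/* Pack four consecutive int8 values into one 32-bit word, lane 0 lowest. */
static uint32_t pack4_i8(const int8_t *p)
{
    return (uint32_t)(uint8_t)p[0]
         | ((uint32_t)(uint8_t)p[1] << 8)
         | ((uint32_t)(uint8_t)p[2] << 16)
         | ((uint32_t)(uint8_t)p[3] << 24);
}

/* Row length: k*k kernel taps rounded up to a multiple of four lanes. */
static size_t padded_len(size_t k)
{
    return (k * k + 3u) & ~(size_t)3u;
}

/* Im2col for one channel, stride 1, no padding (assumed configuration).
 * Each output position gets one row holding its k*k input patch,
 * zero-filled up to padded_len(k). cols must hold
 * (h-k+1) * (w-k+1) * padded_len(k) elements. */
static void im2col_i8(const int8_t *in, size_t h, size_t w, size_t k,
                      int8_t *cols)
{
    size_t out_h = h - k + 1, out_w = w - k + 1, len = padded_len(k);
    for (size_t oy = 0; oy < out_h; ++oy) {
        for (size_t ox = 0; ox < out_w; ++ox) {
            int8_t *row = cols + (oy * out_w + ox) * len;
            size_t n = 0;
            for (size_t ky = 0; ky < k; ++ky)
                for (size_t kx = 0; kx < k; ++kx)
                    row[n++] = in[(oy + ky) * w + (ox + kx)];
            while (n < len)
                row[n++] = 0;       /* padding lanes contribute nothing */
        }
    }
}

/* Convolution as one dot product per im2col row, four taps per MAC step.
 * kernel must also be zero-padded to len elements. */
static void conv_im2col_mac4(const int8_t *cols, const int8_t *kernel,
                             size_t rows, size_t len, int32_t *out)
{
    for (size_t r = 0; r < rows; ++r) {
        int32_t acc = 0;
        for (size_t i = 0; i < len; i += 4)
            acc = mac4_i8(acc, pack4_i8(cols + r * len + i),
                          pack4_i8(kernel + i));
        out[r] = acc;
    }
}
```

In this model the inner loop runs once per four kernel taps instead of once per tap, which is the kind of instruction-count saving the abstract reports. On real hardware the pack step would normally disappear as well, since four adjacent int8 values in an im2col row can be fetched with a single aligned 32-bit load.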
