Mapping quantized convolutional layers on the SiLago platform

University essay from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Convolutional neural networks (CNNs) are used in many applications, such as image classification and computer vision. As CNNs develop, their complexity and computational cost also increase, demanding more memory and resources when deployed on devices, especially embedded systems. The most common approach to compressing CNN models is network quantization, which converts floating-point numbers to fixed-point and thereby reduces both the memory footprint and the amount of computation. Many hardware frameworks have been proposed to accelerate the inference of quantized neural networks. The SiLago platform is a hardware architecture proposed to address the issue of dark silicon and increase automation in VLSI design. It consists of two CGRAs that help achieve high parallelism and ASIC-like efficiency. However, it originally supported only a 16-bit datapath, so it could not process algorithms that use low-bitwidth data types such as 4-bit and 8-bit. In this thesis, we extended the DPU module with extra modes for low-bitwidth data and modified the SiLago instruction set to configure different computation precisions in the DPUs. In addition, we proposed three mapping algorithms to map 4-bit, 8-bit, and 16-bit convolutional layers of quantized CNNs onto the SiLago platform. The algorithms were implemented as SiLago ISA instructions and validated on the hardware through simulations. Results show that quantized convolutional layers can be mapped onto SiLago at various precisions.
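The float-to-fixed-point conversion at the heart of network quantization can be sketched as follows. This is a minimal NumPy illustration of generic uniform fixed-point quantization, not the thesis's exact scheme; the bit widths, fractional-bit split, and round-to-nearest-with-saturation behavior are illustrative assumptions.

```python
import numpy as np

def quantize(x, bits, frac_bits):
    """Convert floats to signed fixed-point with `bits` total bits and
    `frac_bits` fractional bits (round to nearest, saturate at the range).
    Illustrative example only -- not the thesis's specific quantizer."""
    scale = 2 ** frac_bits
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return np.clip(np.round(x * scale), qmin, qmax).astype(np.int32)

def dequantize(q, frac_bits):
    """Recover approximate float values from fixed-point integers."""
    return q / (2 ** frac_bits)

# Example: quantize a few weights to 8-bit fixed-point with 4 fractional bits.
w = np.array([0.5, -1.25, 0.1], dtype=np.float32)
q8 = quantize(w, bits=8, frac_bits=4)      # integers in [-128, 127]
w_hat = dequantize(q8, frac_bits=4)        # approximation of w
```

Storing `q8` instead of `w` uses 8 bits per value instead of 32, at the cost of the rounding error visible in `w_hat`; the same idea with 4-bit or 16-bit widths yields the precisions mapped in this thesis.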
