Deploying Convolutional Neural Networks on A-Core: a combined RISC-V processor and in-memory-computing accelerator

School of Electrical Engineering | Master's thesis
Micro- and Nanoelectronic Circuit Design
Degree programme
Master’s Programme in Electronics and Nanotechnology (TS2013)
In recent years, smart devices have become an integral part of daily life and production. Implementing artificial intelligence (AI) in smart devices brings remarkable performance in human-machine tasks such as data analysis and decision-making. However, constrained resources, power and latency make it challenging to incorporate AI into smart devices built around microcontrollers. Researchers are therefore exploring dedicated hardware platforms to make smart devices more compatible with AI models. In this thesis, we propose a design that uses an In-Memory Computing (IMC) module to execute convolution tasks. To evaluate the application of this module, we integrate it with a Reduced Instruction Set Computer (RISC) processor to perform Convolutional Neural Network (CNN) tasks. The integration is simulated within a System Development Kit (TheSyDeKick) framework for the "A-Core" project at Aalto University. First, a CNN model in Python is imported, revised and trained. To simulate the conversion between the analog and digital domains in the IMC module, the parameters are quantized with varying bit widths during training and then applied in inference tests. Meanwhile, the outputs of each layer are scaled in the inference tests, and the final accuracy is compared with that of the network without quantization. We then construct an equivalent CNN model in C and compile it for the RISC-V processor environment. During this process, we determined the number of quantization bits that optimized performance in the Python model and applied it to the equivalent C model in the RISC-V environment. For the quantization of weights and biases, we selected a clipping range of (-3.96875, 3.96875) and a resolution of 0.015625. For the scaling of outputs between layers, we set a clipping range of (-3.9375, 3.9375) and a resolution of 0.25.
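The weight and bias quantization described above can be sketched as uniform clip-and-round "fake quantization" with the stated clipping range and resolution. This is a minimal illustration, not the thesis's actual code; the function name `fake_quantize` and the NumPy formulation are assumptions.

```python
import numpy as np

def fake_quantize(x, clip_min=-3.96875, clip_max=3.96875, resolution=0.015625):
    """Uniform fake quantization, a sketch of the weight/bias scheme
    described above: clip to the stated range, then round to the
    nearest multiple of the stated resolution (1/64)."""
    x = np.clip(x, clip_min, clip_max)
    return np.round(x / resolution) * resolution

# Values outside the clipping range saturate; values inside snap to the grid.
w = np.array([5.0, -5.0, 0.02])
wq = fake_quantize(w)  # saturates to ±3.96875, rounds 0.02 to 0.015625
```

Applying such a function during training (so the forward pass sees quantized values) is one common way to anticipate the precision loss of an analog IMC datapath.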
We also quantized the 'zero' of the clipping range for the Rectified Linear Unit (ReLU) activation in the convolution and fully connected layers to improve the contrast of the outputs. Overall, manual tests show that the C model with 6-bit quantization produces outputs identical to those of the Python model, with an accuracy of around 98%. Our experimental results demonstrate that the quantized CNN model, based on the integration of the IMC module and the RISC-V processor, achieves accuracy as high as that of an equivalent CNN model purely in Python. This offers an accessible strategy for implementing AI on resource-constrained devices.
Thesis supervisor
Andraud, Martin
Thesis advisor
Numan, Omar
in-memory computing, quantization, convolutional neural network, RISC-V, zero limit, convolution
Other note