Optimizing Transformer Inference on FPGA: A Study on Hardware Acceleration using Vitis HLS
Sähkötekniikan korkeakoulu (School of Electrical Engineering) | Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Author
Date
2023-08-21
Department
Major/Subject
Micro and Nanoelectronic Circuit Design
Mcode
ELEC3036
Degree programme
Master’s Programme in Electronics and Nanotechnology (TS2013)
Language
en
Pages
61
Series
Abstract
In the last decade, advances in Natural Language Processing have reshaped human-computer interaction, driven largely by deep learning models such as the Transformer architecture. With its self-attention mechanism, the Transformer has outperformed traditional architectures on tasks ranging from machine translation to sentiment analysis. However, the computational demands of these models hinder their deployment on devices with limited resources. This thesis proposes an FPGA-based hardware accelerator tailored to the Transformer's encoder block, implemented using the Vitis High-Level Synthesis (HLS) framework. We systematically analyze the Transformer to pinpoint its computational bottlenecks. Built with Vitis HLS, the accelerator emphasizes parallelism, resource efficiency, and optimized memory access, and applies HLS optimizations to improve performance. A key contribution is the integration of the accelerator with the Xilinx ecosystem, simplifying its deployment on FPGA devices. We evaluate the proposed accelerator through benchmarking of its performance, resource utilization, and energy efficiency. The results underscore the accelerator's potential to bridge the computational gap in resource-limited settings and provide a reference point for future NLP hardware acceleration work.
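The abstract refers to the encoder's self-attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, whose matrix multiplications dominate the workload, and to HLS optimizations for parallelism, resource efficiency, and memory access. The sketch below is a minimal, generic Vitis HLS C++ tile-level matrix-multiply kernel, not code from the thesis; the kernel name matmul_tile and the tile size N are assumptions chosen only to illustrate the kind of pragmas involved (on-chip buffering, loop pipelining, array partitioning).

// Minimal illustrative sketch (assumption: not the thesis's actual kernel).
// A tiled matrix multiply of the kind that dominates self-attention and
// feed-forward layers, with Vitis HLS pragmas for the optimization
// directions named in the abstract.

#define N 64  // assumed tile size, chosen only for illustration

extern "C" void matmul_tile(const float A[N][N],
                            const float B[N][N],
                            float C[N][N]) {
  // On-chip copies so the compute loop reads BRAM instead of external memory.
  float A_buf[N][N];
  float B_buf[N][N];
  // Partition along the reduction dimension so the unrolled dot product
  // can read all N operand pairs in parallel.
#pragma HLS ARRAY_PARTITION variable=A_buf complete dim=2
#pragma HLS ARRAY_PARTITION variable=B_buf complete dim=1

copy_in:
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
#pragma HLS PIPELINE II=1
      A_buf[i][j] = A[i][j];
      B_buf[i][j] = B[i][j];
    }
  }

compute:
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
      // Pipelining this loop makes HLS fully unroll the inner k loop,
      // turning the dot product into N parallel multiply-adds.
#pragma HLS PIPELINE II=1
      float acc = 0.0f;
      for (int k = 0; k < N; k++) {
        acc += A_buf[i][k] * B_buf[k][j];
      }
      C[i][j] = acc;
    }
  }
}

In a full design such a kernel would additionally be wrapped with AXI interfaces and driven from the Xilinx runtime; those details are omitted from this sketch.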
Supervisor
Andraud, Martin
Thesis advisor
Adam, Kazybek
Leslin, Jelin
Keywords
transformer, hardware accelerator, self-attention, high-level synthesis, natural language processing, deep learning