Over the last decade, advances in Natural Language Processing (NLP) have reshaped human-computer interaction, driven largely by deep learning models such as the Transformer architecture. With its self-attention mechanism, the Transformer has outperformed traditional architectures on tasks ranging from machine translation to sentiment analysis. However, the computational demands of these models hinder their deployment on devices with limited resources. This thesis proposes an FPGA-based hardware accelerator tailored to the Transformer's encoder block, implemented using the Vitis High-Level Synthesis (HLS) framework.
In this work, we systematically analyze the Transformer encoder to identify its computational bottlenecks. Using the Vitis HLS framework, the accelerator design emphasizes parallelism, resource efficiency, and optimized memory access, applying HLS optimizations to improve performance. A key contribution is the integration of the accelerator with the Xilinx ecosystem, simplifying its deployment on FPGA devices.
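To illustrate the kind of HLS optimization referred to above, the following is a minimal sketch of a pipelined query-key score kernel written in Vitis HLS C++. The function name qk_matmul, the fixed 64x64 tile size, and the specific pragmas are illustrative assumptions, not the kernels developed in this thesis.

    // Illustrative sketch only: a tiled Q * K^T score computation with
    // HLS pragmas for parallelism and memory partitioning. Dimensions and
    // the kernel name are placeholders, not taken from the thesis.
    extern "C" void qk_matmul(const float Q[64][64],
                              const float K[64][64],
                              float S[64][64]) {
        // Partition the inner (k) dimension so the dot-product loop
        // can read all operands in parallel.
    #pragma HLS ARRAY_PARTITION variable=Q complete dim=2
    #pragma HLS ARRAY_PARTITION variable=K complete dim=2
        for (int i = 0; i < 64; ++i) {
            for (int j = 0; j < 64; ++j) {
                // Pipeline the score loop: one output element per cycle
                // once the pipeline fills.
    #pragma HLS PIPELINE II=1
                float acc = 0.0f;
                for (int k = 0; k < 64; ++k) {
                    // Fully unrolled dot product of row i of Q and row j of K.
    #pragma HLS UNROLL
                    acc += Q[i][k] * K[j][k];
                }
                S[i][j] = acc;  // one entry of the attention score matrix
            }
        }
    }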
We evaluate the proposed accelerator through extensive testing, benchmarking its performance, resource utilization, and energy efficiency. The results underscore the accelerator's potential to bridge the computational gap in resource-limited settings and provide a reference point for future NLP hardware acceleration efforts.