A Comparative Study of Classical and Quantum Transformer Models and Their Applications
Files
Shah_Jayaditya_2024.pdf (2.15 MB)
School of Science | Bachelor's thesis
An electronic archive copy is available locally at the Harald Herlin Learning Centre. Aalto University staff can access electronic bachelor's theses by logging into Aaltodoc with their personal Aalto user ID.
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Shah, Jayaditya
Date
2024-09-29
Major/Subject
Quantum Technology
Mcode
SCI3103
Degree programme
Aalto Bachelor’s Programme in Science and Technology
Language
en
Pages
36+20
Abstract
Transformers are a neural network architecture that enables the use of a larger context window than traditional neural networks when training deep learning models. Since the start of this decade, transformers have enabled large language models, improved computer vision, generative models, and other large-scale artificial intelligence systems. This thesis investigates the potential of quantum transformers for deep learning tasks, focusing on the Quixer model developed by Quantinuum, by comparing a classically simulated version of Quixer to classical transformers. The thesis is motivated by the need to optimize transformer components for large-scale applications, in particular the quadratic complexity arising from the self-attention mechanism. The results indicate that Quixer performs in line with the classical baseline published by Quantinuum when reproduced on the same dataset, and that model performance follows the same trend on another dataset of twice the size. As a proof of concept, this suggests that quantum transformers could become an effective method for developing large-scale models as quantum hardware improves.
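
The abstract points to the quadratic complexity of the self-attention mechanism: every token is scored against every other token, so a sequence of length n produces an (n, n) score matrix. The sketch below is a minimal single-head self-attention in NumPy, included purely to illustrate that scaling; it is not code from the thesis or from Quixer, and all names and dimensions are illustrative assumptions.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # X has shape (n, d): n tokens, each a d-dimensional embedding.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # The (n, n) score matrix below is the source of the O(n^2)
    # time and memory cost in the sequence length n.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # shape (n, d)

n, d = 128, 64  # illustrative sequence length and model width
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)  # out.shape == (128, 64)

Doubling the sequence length to 256 quadruples the size of the score matrix; this is the cost that quantum transformer proposals such as Quixer aim to sidestep.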
Supervisor
Raasakka, Matti
Thesis advisor
Raasakka, Matti
Keywords
machine learning, physics, quantum circuits, quantum transformers, quantum information, transformer architecture