# Aaltodoc

Aaltodoc is the institutional repository of Aalto University.


## Communities in Aaltodoc

Select a community to browse its collections.


- Yliopistossa suoritettujen opintojen harjoitus- ja lopputöitä / Coursework, term papers and final projects completed at the university
- Avoimia oppimateriaaleja / Open educational resources
- Yliopiston yksiköiden vuosikertomuksia / Annual reports of the university's units
- Yliopiston yksiköissä toteutettujen hankkeiden väli- ja loppuraportteja sekä tieteellisiä kirjoja / Interim and final reports from projects carried out within the university's units, also scientific books
- Yliopiston järjestämien konferenssien kokoomateoksia / Conference proceedings of the university's events
- Yliopiston yksiköiden julkaisemia avoimia tieteellisiä verkkojulkaisuja / Open access journals published by the university’s units
- Rinnakkaistallennettuja artikkeleita / Green open access articles
- Yliopiston tutkimustietojärjestelmään tallennetut avoimet julkaisut sekä EU-rahoitteisten projektien tutkimustuotokset / Open access publications deposited in the university’s research information system, as well as research outputs from EU-funded projects

## Recent Submissions

Best nonnegative rank-2 matrix approximations

(2024-06-18) Lindy, Etna; Sodomaco, Luca; Perustieteiden korkeakoulu; Kubjas, Kaie

A nonnegative rank-$r$ matrix factorization (NMF) of $X \in \mathbb{R}_{\ge 0}^{m\times n}$ takes the form $X = WH$, where $W \in \mathbb{R}_{\ge 0}^{m\times r}$ and $H \in \mathbb{R}_{\ge 0}^{r\times n}$, so both factors $W$ and $H$ are nonnegative. The nonnegative rank of the matrix $X$, denoted by $\operatorname{rank}_+(X)$, is the smallest $r$ for which such a decomposition exists. When the nonnegative matrix $X$ has rank two, we have $\operatorname{rank}(X) = \operatorname{rank}_+(X)$. In the NMF-$r$ problem, one approximates a given data matrix $U\in \mathbb{R}_{\ge 0}^{m\times n}$ with a matrix of nonnegative rank $r$. The best approximation is the closest point among the matrices of nonnegative rank $r$. The NMF problem has a variety of applications in data analysis, including image processing, data mining, and signal processing. The nonnegativity of the components $W$ and $H$ naturally highlights repeating features from the data, since the terms in the product cannot cancel out.
In this thesis, I inspect the NMF-2 problem for small matrices of size $3\times 4$ and $4\times 4$. The problem is approached algebraically to ensure the optimality of the solution: when the number of critical points (the Euclidean distance degree, or ED degree) is known in advance, one can verify that the numerical computations have found all of them. We reformulate the NMF-2 problem as multiple rank-2 optimization problems with forced zeros. The positions of the forced zeros in the matrix are called the zero pattern.
The NMF-2 problem has been studied previously, and sampling matrices of size $3\times 3$ revealed a connection between the rank-2 optimum and the NMF-2 optimum. Furthermore, some zero patterns were never encountered as the optimum in the sampling. In this thesis, I continue the experiments for $3\times 4$ and $4\times 4$ matrices to see whether these observations still hold in the larger setting.
To find the best nonnegative rank-2 matrix approximation, we first identify the relevant zero patterns, then compute the ED degrees of the corresponding algebraic varieties, and finally solve the problem numerically for uniformly random integer matrices. We were able to classify the relevant zero patterns into three types and to prove that no other patterns can occur as an optimum in the $3\times 3$, $3\times 4$, and $4\times 4$ cases. We also investigated further the connection between the negative entries of the rank-2 optimum and the optimum pattern. Based on the numerical results, however, the optimum pattern cannot be read directly from the negative entries of the rank-2 optimum in the way witnessed in the sampling of $3\times 3$ matrices.
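
As a rough numerical illustration of the NMF-2 problem described above, the following sketch compares the unconstrained best rank-2 approximation (truncated SVD) with a heuristic nonnegative rank-2 factorization. It does not reproduce the algebraic method of the thesis: the multiplicative update rule, the function names, and the example matrix are assumptions chosen for illustration, and the heuristic may only reach a local optimum.

```python
import numpy as np

def nmf_multiplicative(U, r=2, iters=2000, seed=0, eps=1e-12):
    """Heuristic NMF: reduce ||U - W H||_F with W, H >= 0 via multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = U.shape
    W = rng.random((m, r)) + 0.1   # strictly positive start keeps the iterates nonnegative
    H = rng.random((r, n)) + 0.1
    for _ in range(iters):
        H *= (W.T @ U) / (W.T @ W @ H + eps)
        W *= (U @ H.T) / (W @ H @ H.T + eps)
    return W, H

def best_rank_r(U, r=2):
    """Unconstrained best rank-r approximation via truncated SVD."""
    P, s, Qt = np.linalg.svd(U, full_matrices=False)
    return (P[:, :r] * s[:r]) @ Qt[:r, :]

# Hypothetical 3x4 nonnegative integer data matrix (not taken from the thesis).
U = np.array([[9, 1, 0, 4],
              [0, 8, 2, 5],
              [3, 0, 7, 1]], dtype=float)

W, H = nmf_multiplicative(U)
X_nmf = W @ H            # nonnegative, rank at most 2
X_svd = best_rank_r(U)   # rank 2, but may contain negative entries

err_nmf = np.linalg.norm(U - X_nmf)
err_svd = np.linalg.norm(U - X_svd)
print("rank-2 error:", err_svd, "| negative entries:", bool((X_svd < 0).any()))
print("NMF-2 error (heuristic):", err_nmf)
```

Because every matrix of nonnegative rank 2 also has rank at most 2, the NMF-2 error can never be smaller than the rank-2 error; the two coincide exactly when the rank-2 optimum is already nonnegative.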

Side-Channel Attacks in Digital Forensics

(2024-06-18) Mäki, Teemu; Alpírez Bock, Estuardo; Helanti, Lassi; Perustieteiden korkeakoulu; Hollanti, Camilla

Side-channel attacks target the physical implementation of a cryptosystem, exploiting information leaked by the execution of an algorithm. Side-channels arise from variations in timing, power consumption, electromagnetic emanations, and other measurable properties that correlate with the secret data being processed. Side-channel attacks present a viable technique for digital forensics, as cryptography becomes more common in electronic devices.
Bleichenbacher's fast Fourier transform-based solution to the hidden number problem is a powerful attack against the Elliptic Curve Digital Signature Algorithm (ECDSA). The attack assumes that a few bits of a secret nonce, a random one-time value used by the signature generation algorithm, can be recovered through side-channel analysis or other methods. The attack can recover the entire secret signing key from even a single bit of leakage over a large number of signatures.
This thesis presents side-channel attacks in the three primary physical domains: timing, power consumption, and electromagnetic emanations. Then, we present a thorough mathematical analysis of Bleichenbacher's attack to demonstrate how methods from algebra and other areas of mathematics can be applied to perform sophisticated key-recovery attacks.
The crucial range reduction phase of Bleichenbacher's attack involves finding sparse linear combinations of linear congruences $K_i \equiv C_i d + H_i \pmod{n}$ such that the $C_i$ are minimized. We prove that the upper bound for the $C_i$ can be relaxed by a factor of 8/5 compared to previous works. This reduces the requirements for range reduction, the most demanding part of the attack.
We demonstrated Bleichenbacher's attack using the relaxed bound in practice against a 192-bit ECDSA target implementation. We recovered 4 bits of the secret nonces using a Gaussian classifier, commonly known as a template attack, trained with a known signing key. We recovered the entire signing key from the 4-bit leak using 9,425 classified signatures out of a total of 27,000 signatures. The number of signatures is feasible to obtain in many practical scenarios.
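
The linear congruences and the bias they expose can be sketched in a few lines. This is a toy illustration under stated assumptions, not the attack from the thesis: the modulus `n` is a small stand-in for the group order, the values `r` and `h` are random stand-ins for the curve point coordinate and the message hash, and the range reduction and FFT steps are omitted entirely. The sketch only shows that the signing equation rearranges to $K_i \equiv C_i d + H_i \pmod{n}$ and that biased nonces produce a large sample bias for the correct key.

```python
import cmath
import random

def bias(values, n):
    """Magnitude of the sample bias |(1/L) * sum_j exp(2*pi*i*v_j/n)|."""
    total = sum(cmath.exp(2j * cmath.pi * (v / n)) for v in values)
    return abs(total / len(values))

random.seed(1)
n = 2**61 - 1                 # toy prime modulus standing in for the group order (assumption)
d = random.randrange(1, n)    # secret signing key
LEAK_BITS = 4                 # top nonce bits assumed known (here: all zero)

congruences = []              # pairs (C_i, H_i)
nonces = []                   # the corresponding K_i, kept only to check the relation
while len(congruences) < 2000:
    k = random.randrange(1, n >> LEAK_BITS)   # biased nonce
    r = random.randrange(1, n)                # stand-in for the x-coordinate of k*G
    h = random.randrange(n)                   # stand-in for the message hash
    s = pow(k, -1, n) * (h + r * d) % n       # ECDSA signing equation
    if s == 0:
        continue
    s_inv = pow(s, -1, n)
    congruences.append((s_inv * r % n, s_inv * h % n))
    nonces.append(k)

def bias_for_candidate(candidate):
    """Bias of C_i * candidate + H_i mod n over all collected congruences."""
    return bias([(C * candidate + H) % n for C, H in congruences], n)

print("bias for the correct key:", bias_for_candidate(d))
print("bias for a wrong key:    ", bias_for_candidate((d + 12345) % n))
```

For the correct key the values $C_i d + H_i$ equal the biased nonces, so the bias is close to 1; for a wrong key they look uniform modulo $n$ and the bias is close to 0. A real attack cannot test every candidate key this way, which is why the range reduction phase is needed.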

On the quality of mathematical writing produced by ChatGPT and Gemini

(2024-06-18) Aho, Meri; Ilmonen, Pauliina; Perustieteiden korkeakoulu; Ilmonen, Pauliina

Large Language Models (LLMs) have gained popularity in recent years, particularly after the release of ChatGPT, an LLM created by OpenAI. LLMs use Natural Language Processing (NLP) to generate human-like text in response to a given prompt.
In this thesis, we examine two LLMs, ChatGPT and Google Gemini, and investigate the quality of the mathematical text they generate. The thesis addresses three research questions: 1) Can ChatGPT and Gemini write good mathematical text? 2) Can they write sensible references? and 3) Which LLM is better for these two tasks? To answer them, we ask both ChatGPT and Gemini to write abstracts on 25 statistical terms and to give references for the texts. A student and a professor from Aalto University then assess the quality of the writing in terms of mathematical accuracy. The results are further analysed by performing sign tests. Additionally, the LLMs are asked to cite sources in the texts, and these are checked for accuracy. The two LLMs are then given a normalized score based on the number of mistakes made in the references. A sign test and a t-test for paired observations are then conducted to assess the difference between these scores.
We find that both ChatGPT and Gemini generate mathematical text of rather poor quality. Although the text is well written, it lacks accuracy and mathematical correctness, and the word choices are sometimes odd for the type of text being written. The references are often riddled with mistakes or do not exist at all. Although both ChatGPT and Gemini were found to be poor at producing both mathematical text and references, ChatGPT was found to be better than Gemini at producing references, while Gemini was found to be slightly better at producing mathematical text according to both the student and the professor.
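
The sign test for paired observations mentioned above can be sketched as follows. This is a minimal exact two-sided version written for illustration; the function name and the scores are hypothetical and are not the data or the implementation used in the thesis.

```python
from math import comb

def sign_test(pairs):
    """Exact two-sided sign test for paired observations; tied pairs are dropped."""
    pos = sum(1 for a, b in pairs if a > b)
    neg = sum(1 for a, b in pairs if a < b)
    n = pos + neg
    if n == 0:
        return 1.0  # only ties: no evidence of a difference
    k = min(pos, neg)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for the two-sided alternative
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Hypothetical paired quality scores (model A, model B) for ten terms.
scores = [(3, 2), (4, 4), (2, 1), (5, 3), (3, 4),
          (4, 2), (2, 2), (5, 4), (3, 1), (4, 3)]
print("two-sided p-value:", sign_test(scores))
```

The test uses only the direction of each paired difference, not its size, which makes it a reasonable choice for ordinal quality ratings where the spacing between grades is not meaningful.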

Diffuse Optical Tomography with an Inaccurate Forward Operator in the Linearized Inverse Problem

(2024-06-18) Hakula, Aada; Hirvi, Pauliina; Perustieteiden korkeakoulu; Hyvönen, Nuutti

This thesis studies linearized inverse problems with an inaccurate forward operator. The inverse problems are considered in the Bayesian framework.
We use principal component analysis (PCA) to obtain a Bayesian model where the inaccuracy of the forward operator is included, and introduce two different algorithms for optimizing it: an alternating expectation-maximization algorithm and gradient descent. The former one aims to estimate both the primary unknown and the operator inaccuracy, and the latter one is applied to the posterior that is already marginalized with respect to the operator inaccuracy.
We apply the methods to the ill-posed, high-dimensional difference image reconstruction problem of diffuse optical tomography (DOT). In this thesis, the focus lies on the functional brain imaging of neonates. Our primary goal is to reconstruct the activity-related absorption changes in a neonatal brain, and our secondary goal is to obtain an approximation for the inaccurate forward operator, which in the context of brain imaging is the sensitivity profile, i.e., the Jacobian matrix of the measurements with respect to absorption for the considered neonate.
We first tested the two algorithms using a simple validation example. Both algorithms managed to reconstruct the underlying signal with reasonable accuracy. However, due to the relative slowness of the gradient descent algorithm compared to the expectation-maximization algorithm, we decided to test only the latter one on our imaging problem of DOT.
The performance of the expectation-maximization algorithm on the simulated imaging problem was promising. The algorithm was able to improve the initial reconstruction of the absorption change. The perturbation area was confined better, and there was also less noise in the background of the final reconstruction compared to the initial one with a mean Jacobian. However, the contrast was weaker in the final reconstruction, and the unknown forward operator could not be reconstructed with reasonable accuracy. Overall, the behavior of the expectation-maximization algorithm was very sensitive to, e.g., the location and contrast of the perturbation, the properties of noise, and the regularization of the principal components. Nevertheless, the results were promising and the algorithm has potential for further studies.
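
To illustrate why an inaccurate forward operator matters in a linearized Bayesian inverse problem, the sketch below computes the Gaussian posterior mean for $y = Jx + e$ once with the true operator and once with a perturbed one. It is a toy example under simple assumptions (white Gaussian noise and prior, a random dense matrix as the operator) and does not implement the PCA model, the expectation-maximization algorithm, or the DOT setting of the thesis; all names and sizes are hypothetical.

```python
import numpy as np

def map_estimate(J, y, noise_std, prior_std):
    """Posterior mean for y = J x + e with e ~ N(0, noise_std^2 I), x ~ N(0, prior_std^2 I)."""
    n = J.shape[1]
    A = J.T @ J / noise_std**2 + np.eye(n) / prior_std**2
    b = J.T @ y / noise_std**2
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
m, n = 40, 10
J_true = rng.normal(size=(m, n))                  # operator that generated the data
J_mean = J_true + 0.3 * rng.normal(size=(m, n))   # inaccurate operator used in the inversion
x_true = np.zeros(n)
x_true[3] = 1.0                                   # single localized perturbation
y = J_true @ x_true + 0.01 * rng.normal(size=m)

x_accurate = map_estimate(J_true, y, noise_std=0.01, prior_std=1.0)
x_inaccurate = map_estimate(J_mean, y, noise_std=0.01, prior_std=1.0)

err_accurate = np.linalg.norm(x_accurate - x_true)
err_inaccurate = np.linalg.norm(x_inaccurate - x_true)
print("error with the true operator:      ", err_accurate)
print("error with the inaccurate operator:", err_inaccurate)
```

In this toy setting the operator error, not the measurement noise, dominates the reconstruction error, which is the motivation for modelling the operator inaccuracy explicitly.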

Automated optimisation workflow for radiotherapy using dose mimicking from deep learning predicted dose

(2024-06-18) Myllymäki, Saku; Bolard, Gregory; Perustieteiden korkeakoulu; Oliveira, Fabricio

Intensity modulated radiotherapy (IMRT) and volumetric modulated arc therapy (VMAT) are widespread optimisation frameworks used in treatment plan generation for cancer patients today. Practitioners commonly spend a significant amount of time balancing optimisation objectives to produce clinically acceptable treatment plans. Recent work has shown that machine learning predicted dose distributions, used in conjunction with dose mimicking or structure objectives derived from the predicted dose, can produce high quality treatment plans.
In this thesis, the aim was to use MVision’s deep learning predicted dose distributions and evaluate different optimisation strategies for producing dose distributions similar in quality. Pure dose mimicking approaches, an approach using dose derived structure objectives, and hybrid dose mimicking and structure objective approaches were explored.
For the experiments, MatRad, an open source radiotherapy treatment planning toolkit, was used. The toolkit provides a pencil beam dose influence matrix calculation algorithm and an interior point method optimiser package, which were used for dose optimisation. Voxel mimicking approaches were generally not able to mimic the predicted dose distributions, and the optimised dose distributions were of significantly lower quality. No optimisation method was able to consistently reproduce the quality of the predicted dose, but the structure based objectives showed the most potential.
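
In its simplest form, voxel-wise dose mimicking can be written as a nonnegative least-squares problem: find beamlet weights $w \ge 0$ such that the delivered dose $Dw$ is close to the predicted dose. The sketch below solves a toy instance with projected gradient descent. It is only an illustration under stated assumptions: the random matrix `D` stands in for a dose influence matrix, the function name is hypothetical, and the solver is not the interior point optimiser used in the thesis.

```python
import numpy as np

def mimic_dose(D, d_pred, iters=5000):
    """Projected gradient descent for min_w 0.5 * ||D w - d_pred||^2 subject to w >= 0."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / (largest singular value)^2
    w = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ w - d_pred)
        w = np.maximum(w - step * grad, 0.0)  # project onto the nonnegative orthant
    return w

rng = np.random.default_rng(0)
D = rng.random((30, 5))                            # toy influence matrix: 30 voxels, 5 beamlets
w_true = np.array([1.0, 0.0, 2.0, 0.5, 0.0])       # hypothetical deliverable weights
d_pred = D @ w_true                                # a "predicted" dose that is exactly deliverable

w = mimic_dose(D, d_pred)
print("recovered weights:", np.round(w, 4))
print("dose residual:", np.linalg.norm(D @ w - d_pred))
```

Here the target dose is deliverable by construction, so the residual goes to zero. A dose predicted by a learned model need not be physically deliverable, in which case the optimiser can only approximate it.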