Algebraic Aspects of Hidden Variable Models

School of Science | Doctoral thesis (article-based) | Defence date: 2023-08-11
Date
2023
Language
en
Pages
81 + app. 119
Series
Aalto University publication series DOCTORAL THESES, 113/2023
Abstract
Hidden variables are random variables that cannot be observed directly, yet they are important for understanding a phenomenon of interest because they affect the observable variables. Hidden variable models aim to represent the effect of variables that are thought to exist in theory but for which no data are available. In this thesis, we focus on two hidden variable models in phylogenetics and statistics.
In phylogenetics, we seek answers to two important questions related to modeling evolution. First, we study the embedding problem for the group-based models and for the strand symmetric model and its higher order generalizations. In Publication I, we provide embeddability criteria for group-based models equipped with a certain labeling. In Publication III, we characterize embeddability in the strand symmetric model. These results allow us to approximate the proportion of embeddable Markov matrices within the space of Markov matrices, and they generalize previously established embeddability results for the Jukes-Cantor and Kimura models. The second question concerns the distinguishability of phylogenetic network models, which is related to the notion of generic identifiability. In Publication II, we provide conditions on the network topology that ensure the distinguishability of the associated phylogenetic network models under some group-based models.
The last part of this thesis is dedicated to the factor analysis model, a statistical model that seeks to reduce a large number of observable variables to a smaller number of hidden variables. The factor analysis model assumes that the observed variables can be expressed as linear combinations of the hidden variables plus error terms. Moreover, the observed variables, the hidden variables, and the error terms are all assumed to be Gaussian. We generalize the factor analysis model by dropping the Gaussianity assumption and introduce the higher order factor analysis model. In Publication IV, we provide the dimension of the higher order factor analysis model and present some conditions under which the model has positive codimension.
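As a notational sketch of the two central objects named in the abstract, written in standard textbook notation: the symbols M, Q, X, Y, \Lambda, \varepsilon, \Psi, \Sigma and the zero-mean, identity-covariance normalization are assumptions for illustration and do not appear in the abstract itself.

```latex
% Embedding problem (assumed notation): a Markov matrix M is embeddable
% if it is the matrix exponential of some rate matrix Q, i.e. a matrix
% with nonnegative off-diagonal entries and rows summing to zero.
\[
  M = \exp(Q), \qquad q_{ij} \ge 0 \ (i \ne j), \qquad \sum_{j} q_{ij} = 0 .
\]

% Gaussian factor analysis (assumed notation): observed X in R^p,
% hidden Y in R^m with m < p, error term epsilon independent of Y.
\[
  X = \Lambda Y + \varepsilon, \qquad
  Y \sim \mathcal{N}(0, I_m), \qquad
  \varepsilon \sim \mathcal{N}(0, \Psi), \quad \Psi \ \text{diagonal},
\]
\[
  \Sigma = \operatorname{Cov}(X) = \Lambda \Lambda^{\top} + \Psi .
\]
```

In the Gaussian case the model is determined by the covariance matrix \Sigma alone; the generalization described above drops the Gaussian assumption on Y and \varepsilon.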
Description
Supervising professor
Kubjas, Kaie, Assistant Prof., Aalto University, Department of Mathematics and Systems Analysis, Finland
Thesis advisor
Kubjas, Kaie, Assistant Prof., Aalto University, Department of Mathematics and Systems Analysis, Finland
Keywords
hidden variable model, the embedding problem, phylogenetic model, factor analysis model
Parts
  • [Publication 1]: Muhammad Ardiyansyah, Dimitra Kosta, and Kaie Kubjas. The Model-Specific Markov Embedding Problem for Symmetric Group-Based Models. Journal of Mathematical Biology, Volume 83, Article 33, September 2021.
    Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-202109229318
    DOI: 10.1007/s00285-021-01656-5
  • [Publication 2]: Muhammad Ardiyansyah. Distinguishing Level-2 Phylogenetic Networks Using Algebraic and Combinatorial Techniques. Submitted to Acta Biotheoretica, June 2022.
  • [Publication 3]: Muhammad Ardiyansyah, Dimitra Kosta, and Jordi Roca-Lacostena. Embeddability of Centrosymmetric Matrices Capturing The Double-Helix Structure in Natural and Synthetic DNA. Journal of Mathematical Biology, Volume 86, Article 69, April 2023.
    Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-202304262836
    DOI: 10.1007/s00285-023-01895-8
  • [Publication 4]: Muhammad Ardiyansyah and Luca Sodomaco. Dimension of Higher Order Factor Analysis Models. Accepted for publication in Algebraic Statistics, June 2023.