A Study of Sex-Specific Proteome in Multi-Ethnic Study of Atherosclerosis (MESA)

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Hartonen, Tuomo
dc.contributor.author: Zhu, Meihui
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.school: School of Science [en]
dc.contributor.supervisor: Marttinen, Pekka
dc.date.accessioned: 2025-01-30T18:02:52Z
dc.date.available: 2025-01-30T18:02:52Z
dc.date.issued: 2024-12-31
dc.description.abstract: Protein expression levels play a crucial role in shaping biological phenotypes. Given the sexual dimorphism between genetic females (XX) and genetic males (XY), significant differences in protein expression are expected but remain underexplored. This thesis, as part of a larger collaborative project, aims to replicate sex-specific proteomic patterns identified in the UK Biobank (UKB) and to evaluate the transferability of a proteomic sex prediction model. It explores the consistency of these findings across diverse ethnic groups and investigates age-related changes in sex-bias estimates and in ProtSexIndex (the deviation between genetic sex and proteomic sex) using longitudinal data from the MESA cohort. Extensive preprocessing and quality control steps were undertaken to ensure robust and reliable results. Building on logistic regression methods used to identify protein sex-bias in UKB and on an XGBoost model trained to estimate proteomic sex, this thesis evaluates the model’s transferability through two Reciprocal Generalizability Tests and extends the analysis to explore cohort-specific differences and temporal dynamics. The UKB-trained model achieved high performance in MESA (AUC = 0.9938, 95% CI [0.9903, 0.9967]). Despite this, cohort-specific differences inherent to the relative nature of NPX values introduced calibration challenges. Longitudinal analyses demonstrated stable sex-bias estimates across exams (Spearman correlation > 0.95, p < 1e-15), accompanied by a significant reduction in the magnitude of sex differences with aging (923 out of 2917 proteins, z-test). ProtSexIndex increased significantly with age for both males (t-statistic = -3.75, p = 0.0002) and females (t-statistic = -6.38, p < 1e-9), reflecting the dynamic nature of proteomic sex. Multi-ethnicity analyses highlighted consistent sex-bias estimates and robust model performance across racial groups, underscoring the generalizability of the findings. [en]
dc.format.extent: 71
dc.format.mimetype: application/pdf [en]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/133936
dc.identifier.urn: URN:NBN:fi:aalto-202501302219
dc.language.iso: en [en]
dc.programme: Master's Programme in Computer, Communication and Information Sciences [en]
dc.programme.major: Machine Learning, Data Science and Artificial Intelligence [en]
dc.subject.keyword: proteomics [en]
dc.subject.keyword: sex difference [en]
dc.subject.keyword: XGBoost [en]
dc.subject.keyword: aging [en]
dc.subject.keyword: ethnicity [en]
dc.subject.keyword: transferability [en]
dc.title: A Study of Sex-Specific Proteome in Multi-Ethnic Study of Atherosclerosis (MESA) [en]
dc.type: G2 Pro gradu, diplomityö [fi]
dc.type.ontasot: Master's thesis [en]
dc.type.ontasot: Diplomityö [fi]
local.aalto.electroniconly: yes
local.aalto.openaccess: yes

Files

Original bundle

Name: master_Zhu_Meihui_2025.pdf
Size: 5.46 MB
Format: Adobe Portable Document Format