Optimizing single-cell RNA sequencing atlas integration: A pipeline approach
School of Chemical Engineering (Kemian tekniikan korkeakoulu)
Master's thesis
Date
2024-08-29
Major/Subject
Biological and Chemical Engineering
Degree programme
Master's Programme in Biological and Chemical Engineering for a Sustainable Bioeconomy
Language
en
Pages
75
Abstract
Single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of cellular diversity across organisms and models. With the exponential growth of scRNA-seq data, effective integration of datasets has become essential. However, batch effects, computational limitations, and the lack of consensus on integration methodology complicate the process. This thesis addresses these limitations in order to build a comprehensive neurodevelopmental atlas of fetal neuronal populations and of 2D- and 3D-derived induced pluripotent stem cell (iPSC) neuronal populations. To this end, the work benchmarks several state-of-the-art integration tools on neuronal cell datasets from Velmeshev et al. 2023 to determine which strategies integrate the data most effectively while maintaining biological integrity. We propose a combined pipeline that uses two generative-model methods, scVI and scPoli, to mitigate batch effects while preserving biological signal. The final atlas consists of well-integrated datasets with accurate cell type annotations, enabling downstream analysis of cell type composition in our neuronal differentiation protocol (Shi et al. 2012). Finally, the integration allowed in vitro iPSC models to be compared with in vivo fetal data in terms of gene expression, improving our understanding of how applicable the models are.
Supervisor
Lähdesmäki, Harri
Thesis advisor
Kilpinen, Helena
Puigdevall, Pau
Keywords
single-cell RNA-sequencing, data integration, batch effects, induced pluripotent stem cells, benchmarking, neurodevelopment