[dipl] Perustieteiden korkeakoulu / SCI
Permanent URI for this collection: https://aaltodoc.aalto.fi/handle/123456789/21
Browsing [dipl] Perustieteiden korkeakoulu / SCI by Degree programme/Major subject "Bioinformaatioteknologia"
- Human leukocyte antigen (HLA) genotyping from next-generation sequencing data – a comparison of existing methods
Perustieteiden korkeakoulu | Master's thesis (2016-04-28) Kilpeläinen, Elina
The major histocompatibility complex (MHC) contains a variety of immunologically important genes. The best known are the human leukocyte antigen (HLA) genes, which code for receptors that present antigens to lymphocytes. HLA molecules thus help to induce an immune response against pathogenic factors. Because they help distinguish self from non-self structures, HLAs are crucial in clinical tissue transfers. Differences in HLAs between donor and patient are known to cause graft rejection and graft-versus-host disease, so transfers can be performed only between HLA-matched donor-patient pairs. Currently, HLA typing is performed on a subset of HLA genes. Despite these efforts, a successful treatment result cannot be guaranteed, likely because many other immunologically relevant genes are left uncharacterized. Next-generation sequencing (NGS) could produce more comprehensive data to support HLA typing. However, analyzing NGS data derived from the highly polymorphic and repetitive MHC/HLA locus is difficult, and the typical read mapping and variant calling approaches used for standard NGS data analysis cannot be applied. To tackle these issues, several pipelines designed solely for HLA genotyping from NGS data have been developed. All of these programs use a large reference set of HLA alleles against which the NGS data are analyzed. Typically, HLA genotype calling is based either on read-to-reference alignments or on the construction of longer read contigs that are then compared to the reference panel. The aim of this work was to investigate available analysis options for HLA typing from NGS data and to test such solutions with targeted NGS data.
Specifically, we tested four open-source pipelines: two based on read assembly (HLAreporter, ATHLATES) and two based on read mapping (HLAssign, OptiType). In addition, a commercial program (Omixon Target) was tested. The programs were evaluated by comparing their predictions to known HLA genotypes. Additionally, a majority vote over all the predictions at each locus was constructed to obtain high-confidence HLA genotype calls despite errors made by individual programs. We conclude that although the programs performed well, none of them was error-free. Thus, HLA typing from NGS data is not flawless, and care should be taken in interpreting the results and in choosing the program(s) to be used.
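The per-locus majority vote mentioned in the abstract could look roughly like the sketch below. The abstract does not say how ties, missing calls, or allele order were handled, so everything here is an assumption: a genotype is treated as an unordered pair of allele names, a call is accepted only when more than half of the programs that reported the locus agree, and the function name `majority_vote`, the parameter `min_fraction`, and the example calls are hypothetical illustrations, not results or code from the thesis.

```python
from collections import Counter


def majority_vote(predictions, min_fraction=0.5):
    """Combine per-program HLA calls into one consensus call per locus.

    predictions maps program name -> {locus: (allele1, allele2)}.
    Assumption: a consensus needs strictly more than min_fraction of the
    programs that reported the locus; otherwise the locus is left as None.
    """
    loci = sorted({locus for calls in predictions.values() for locus in calls})
    consensus = {}
    for locus in loci:
        votes = Counter()
        for calls in predictions.values():
            genotype = calls.get(locus)
            if genotype is not None:
                # Sort so that (a, b) and (b, a) count as the same genotype.
                votes[tuple(sorted(genotype))] += 1
        total = sum(votes.values())
        if total == 0:
            consensus[locus] = None
            continue
        genotype, count = votes.most_common(1)[0]
        consensus[locus] = genotype if count / total > min_fraction else None
    return consensus


# Made-up calls for illustration only; these are not data from the thesis.
example_calls = {
    "HLAreporter": {"A": ("A*02:01", "A*01:01"), "B": ("B*07:02", "B*08:01")},
    "ATHLATES": {"A": ("A*01:01", "A*02:01"), "B": ("B*07:02", "B*08:01")},
    "HLAssign": {"A": ("A*01:01", "A*02:01"), "B": ("B*07:02", "B*44:02")},
    "OptiType": {"A": ("A*01:01", "A*03:01"), "B": ("B*07:02", "B*44:02")},
}
consensus = majority_vote(example_calls)
print(consensus)
```

In this toy input, three of four programs agree at locus A, so that genotype is accepted, while the two-against-two split at locus B is left uncalled rather than resolved arbitrarily. A stricter or looser rule can be tried by changing `min_fraction`.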