A pipeline for data analysis of canine exome-sequencing data

School of Science | Master's thesis

Date

2013

Major/Subject

Information Technology (Informaatiotekniikka)

Mcode

T-61

Language

en

Pages

63

Abstract

Single nucleotide polymorphisms (SNPs), short indels, and large structural variations have become common sources of the genetic variation underlying several rare and common disorders. Whole-exome sequencing is a powerful approach for identifying the large number of complex-disease-associated genetic variants in exons that contribute to the clinical diagnosis of these diseases. In this project, we focused on genetic analysis of the canine disease heritage as a model for human genetic diseases. Ten canine exome samples were sequenced on an Illumina HiSeq 2000. Because exome sequencing is a novel approach in canine genomics and there is no existing literature on this methodology, the primary aim of this thesis is to develop a thorough and efficient analysis pipeline for paired-end sequencing data that reveals the possible kinds of genetic polymorphisms. The pipeline incorporates several existing bioinformatics tools to perform the computational steps of the analysis, which range from quality checking of the raw data to alignment, variant calling, and annotation of the variants. The results of the pipeline give medical geneticists a comprehensive set of information for downstream analysis aimed at discovering causative mutations in different projects.

Supervisor

Lähdesmäki, Harri

Thesis advisor

Lohi, Hannes

Keywords

canine exome-sequencing, pipeline, NGS
