Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation
Access rights
openAccess
publishedVersion
A1 Original article in a scientific journal
This publication is imported from the Aalto University research portal.
Date
2022-10-10
Language
en
Pages
12
Series
Philosophical transactions of the Royal Society of London. Series B, Biological sciences, Volume 377, issue 1861, pp. 1-12
Abstract
In less than a decade, population genomics of microbes has progressed from the effort of sequencing dozens of strains to thousands, or even tens of thousands of strains in a single study. There are now hundreds of thousands of genomes available even for a single bacterial species, and the number of genomes is expected to continue to increase at an accelerated pace given the advances in sequencing technology and widespread genomic surveillance initiatives. This explosion of data calls for innovative methods to enable rapid exploration of the structure of a population based on different data modalities, such as multiple sequence alignments, assemblies and estimates of gene content across different genomes. Here, we present Mandrake, an efficient implementation of a dimensional reduction method tailored for the needs of large-scale population genomics. Mandrake is capable of visualizing population structure from millions of whole genomes, and we illustrate its usefulness with several datasets representing major pathogens. Our method is freely available both as an analysis pipeline (https://github.com/johnlees/mandrake) and as a browser-based interactive application (https://gtonkinhill.github.io/mandrake-web/). This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.
Keywords
dimensional reduction, genomics, pathogens, population structure, visualization
Citation
Lees, J A, Tonkin-Hill, G, Yang, Z & Corander, J 2022, 'Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation', Philosophical transactions of the Royal Society of London. Series B, Biological sciences, vol. 377, no. 1861, 20210237, pp. 1-12. https://doi.org/10.1098/rstb.2021.0237