
Aligning the unalignable: bacteriophage whole genome alignments

BACKGROUND: In recent years, many studies have focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assess...


Bibliographic Details
Main Authors: Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4711071/
https://www.ncbi.nlm.nih.gov/pubmed/26757899
http://dx.doi.org/10.1186/s12859-015-0869-5
author Bérard, Sèverine
Chateau, Annie
Pompidor, Nicolas
Guertin, Paul
Bergeron, Anne
Swenson, Krister M.
author_sort Bérard, Sèverine
collection PubMed
description BACKGROUND: In recent years, many studies have focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. RESULTS: In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressiveMauve aligner, which implements a partial order strategy but whose alignments are linearized, shows a greatly improved interactive graphic display, while avoiding misalignments. CONCLUSIONS: Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A Python implementation of Alpha, along with installation instructions for Ubuntu and OS X, is available on Bitbucket (https://bitbucket.org/thekswenson/alpha). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0869-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4711071
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4711071 2016-01-14 Aligning the unalignable: bacteriophage whole genome alignments Bérard, Sèverine Chateau, Annie Pompidor, Nicolas Guertin, Paul Bergeron, Anne Swenson, Krister M. BMC Bioinformatics Methodology Article BioMed Central 2016-01-13 /pmc/articles/PMC4711071/ /pubmed/26757899 http://dx.doi.org/10.1186/s12859-015-0869-5 Text en © Bérard et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Aligning the unalignable: bacteriophage whole genome alignments
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4711071/
https://www.ncbi.nlm.nih.gov/pubmed/26757899
http://dx.doi.org/10.1186/s12859-015-0869-5