Aligning the unalignable: bacteriophage whole genome alignments
BACKGROUND: In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assess...
Main authors: | Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4711071/ https://www.ncbi.nlm.nih.gov/pubmed/26757899 http://dx.doi.org/10.1186/s12859-015-0869-5 |
_version_ | 1782409913244319744 |
---|---|
author | Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M. |
author_facet | Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M. |
author_sort | Bérard, Sèverine |
collection | PubMed |
description | BACKGROUND: In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. RESULTS: In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressiveMauve aligner, which implements a partial order strategy but whose alignments are linearized, shows a greatly improved interactive graphic display, while avoiding misalignments. CONCLUSIONS: Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A Python implementation of Alpha, along with installation instructions for Ubuntu and OS X, is available on Bitbucket (https://bitbucket.org/thekswenson/alpha). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0869-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4711071 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4711071 2016-01-14 Aligning the unalignable: bacteriophage whole genome alignments Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M. BMC Bioinformatics Methodology Article BioMed Central 2016-01-13 /pmc/articles/PMC4711071/ /pubmed/26757899 http://dx.doi.org/10.1186/s12859-015-0869-5 Text en © Bérard et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Bérard, Sèverine Chateau, Annie Pompidor, Nicolas Guertin, Paul Bergeron, Anne Swenson, Krister M. Aligning the unalignable: bacteriophage whole genome alignments |
title | Aligning the unalignable: bacteriophage whole genome alignments |
title_sort | aligning the unalignable: bacteriophage whole genome alignments |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4711071/ https://www.ncbi.nlm.nih.gov/pubmed/26757899 http://dx.doi.org/10.1186/s12859-015-0869-5 |