
eCAMBer: efficient support for large-scale comparative analysis of multiple bacterial strains


Bibliographic Details
Main authors: Wozniak, Michal; Wong, Limsoon; Tiuryn, Jerzy
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4023553/
https://www.ncbi.nlm.nih.gov/pubmed/24597904
http://dx.doi.org/10.1186/1471-2105-15-65
author Wozniak, Michal
Wong, Limsoon
Tiuryn, Jerzy
collection PubMed
description BACKGROUND: Inconsistencies are often observed in the genome annotations of bacterial strains. Moreover, these inconsistencies are often not reflected by sequence discrepancies, but are caused by wrongly annotated gene starts as well as mis-identified gene presence. Thus, tools are needed for improving annotation consistency and accuracy among sets of bacterial strain genomes. RESULTS: We have developed eCAMBer, a tool for efficiently supporting comparative analysis of multiple bacterial strains within the same species. eCAMBer is a highly optimized revision of our earlier tool, CAMBer, scaling it up for significantly larger datasets comprising hundreds of bacterial strains. eCAMBer works in two phases. First, it transfers gene annotations among all considered bacterial strains. In this phase, it also identifies homologous gene families and annotation inconsistencies. Second, eCAMBer tries to improve the quality of annotations by resolving the gene start inconsistencies and filtering out gene families arising from annotation errors propagated in the previous phase. CONCLUSIONS: eCAMBer efficiently identifies and resolves annotation inconsistencies among closely related bacterial genomes. It outperforms other competing tools both in terms of running time and accuracy of produced annotations. Software, user manual, and case study results are available at the project website: http://bioputer.mimuw.edu.pl/ecamber.
format Online
Article
Text
id pubmed-4023553
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4023553 2014-05-28 eCAMBer: efficient support for large-scale comparative analysis of multiple bacterial strains Wozniak, Michal Wong, Limsoon Tiuryn, Jerzy BMC Bioinformatics Methodology Article BACKGROUND: Inconsistencies are often observed in the genome annotations of bacterial strains. Moreover, these inconsistencies are often not reflected by sequence discrepancies, but are caused by wrongly annotated gene starts as well as mis-identified gene presence. Thus, tools are needed for improving annotation consistency and accuracy among sets of bacterial strain genomes. RESULTS: We have developed eCAMBer, a tool for efficiently supporting comparative analysis of multiple bacterial strains within the same species. eCAMBer is a highly optimized revision of our earlier tool, CAMBer, scaling it up for significantly larger datasets comprising hundreds of bacterial strains. eCAMBer works in two phases. First, it transfers gene annotations among all considered bacterial strains. In this phase, it also identifies homologous gene families and annotation inconsistencies. Second, eCAMBer tries to improve the quality of annotations by resolving the gene start inconsistencies and filtering out gene families arising from annotation errors propagated in the previous phase. CONCLUSIONS: eCAMBer efficiently identifies and resolves annotation inconsistencies among closely related bacterial genomes. It outperforms other competing tools both in terms of running time and accuracy of produced annotations. Software, user manual, and case study results are available at the project website: http://bioputer.mimuw.edu.pl/ecamber. BioMed Central 2014-03-05 /pmc/articles/PMC4023553/ /pubmed/24597904 http://dx.doi.org/10.1186/1471-2105-15-65 Text en Copyright © 2014 Wozniak et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title eCAMBer: efficient support for large-scale comparative analysis of multiple bacterial strains
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4023553/
https://www.ncbi.nlm.nih.gov/pubmed/24597904
http://dx.doi.org/10.1186/1471-2105-15-65