BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data
BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even in the step of preliminary assembly of...
Main authors: | Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2012 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3504008/ https://www.ncbi.nlm.nih.gov/pubmed/23185310 http://dx.doi.org/10.1371/journal.pone.0049239 |
_version_ | 1782250552494653440 |
---|---|
author | Pareja-Tobes, Pablo Manrique, Marina Pareja-Tobes, Eduardo Pareja, Eduardo Tobes, Raquel |
author_facet | Pareja-Tobes, Pablo Manrique, Marina Pareja-Tobes, Eduardo Pareja, Eduardo Tobes, Raquel |
author_sort | Pareja-Tobes, Pablo |
collection | PubMed |
description | BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the preliminary assembly stage of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating the ORF prediction and functional annotation phases in a single step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation frequently found in incompletely assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC outbreak in Germany, in which BG7 was used to produce the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been demonstrated for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. In addition, thanks to its plasticity, our system could be easily adapted to work with new technologies in the future. |
format | Online Article Text |
id | pubmed-3504008 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-35040082012-11-26 BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data Pareja-Tobes, Pablo Manrique, Marina Pareja-Tobes, Eduardo Pareja, Eduardo Tobes, Raquel PLoS One Research Article BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the preliminary assembly stage of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating the ORF prediction and functional annotation phases in a single step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation frequently found in incompletely assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC outbreak in Germany, in which BG7 was used to produce the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been demonstrated for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. In addition, thanks to its plasticity, our system could be easily adapted to work with new technologies in the future. Public Library of Science 2012-11-21 /pmc/articles/PMC3504008/ /pubmed/23185310 http://dx.doi.org/10.1371/journal.pone.0049239 Text en © 2012 Pareja-Tobes et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Pareja-Tobes, Pablo Manrique, Marina Pareja-Tobes, Eduardo Pareja, Eduardo Tobes, Raquel BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data |
title | BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data |
title_full | BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data |
title_fullStr | BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data |
title_full_unstemmed | BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data |
title_short | BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data |
title_sort | bg7: a new approach for bacterial genome annotation designed for next generation sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3504008/ https://www.ncbi.nlm.nih.gov/pubmed/23185310 http://dx.doi.org/10.1371/journal.pone.0049239 |
work_keys_str_mv | AT parejatobespablo bg7anewapproachforbacterialgenomeannotationdesignedfornextgenerationsequencingdata AT manriquemarina bg7anewapproachforbacterialgenomeannotationdesignedfornextgenerationsequencingdata AT parejatobeseduardo bg7anewapproachforbacterialgenomeannotationdesignedfornextgenerationsequencingdata AT parejaeduardo bg7anewapproachforbacterialgenomeannotationdesignedfornextgenerationsequencingdata AT tobesraquel bg7anewapproachforbacterialgenomeannotationdesignedfornextgenerationsequencingdata |