MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects
BACKGROUND: Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exot...
Main Authors: | Holt, Carson; Yandell, Mark |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3280279/ https://www.ncbi.nlm.nih.gov/pubmed/22192575 http://dx.doi.org/10.1186/1471-2105-12-491 |
_version_ | 1782223806881857536 |
---|---|
author | Holt, Carson Yandell, Mark |
author_facet | Holt, Carson Yandell, Mark |
author_sort | Holt, Carson |
collection | PubMed |
description | BACKGROUND: Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. RESULTS: We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. CONCLUSIONS: MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets. |
format | Online Article Text |
id | pubmed-3280279 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3280279 2012-02-16 MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects Holt, Carson Yandell, Mark BMC Bioinformatics Software BACKGROUND: Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. RESULTS: We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. CONCLUSIONS: MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets. BioMed Central 2011-12-22 /pmc/articles/PMC3280279/ /pubmed/22192575 http://dx.doi.org/10.1186/1471-2105-12-491 Text en Copyright © 2011 Holt and Yandell; licensee BioMed Central Ltd. https://creativecommons.org/licenses/by/2.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Holt, Carson Yandell, Mark MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects |
title | MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects |
title_full | MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects |
title_fullStr | MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects |
title_full_unstemmed | MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects |
title_short | MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects |
title_sort | maker2: an annotation pipeline and genome-database management tool for second-generation genome projects |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3280279/ https://www.ncbi.nlm.nih.gov/pubmed/22192575 http://dx.doi.org/10.1186/1471-2105-12-491 |
work_keys_str_mv | AT holtcarson maker2anannotationpipelineandgenomedatabasemanagementtoolforsecondgenerationgenomeprojects AT yandellmark maker2anannotationpipelineandgenomedatabasemanagementtoolforsecondgenerationgenomeprojects |