Limitations of the rhesus macaque draft genome assembly and annotation


Bibliographic Details
Main Authors: Zhang, Xiongfei; Goodsell, Joel; Norgren, Robert B
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3426473/
https://www.ncbi.nlm.nih.gov/pubmed/22646658
http://dx.doi.org/10.1186/1471-2164-13-206
collection PubMed
description Finished genome sequences and assemblies are available for only a few vertebrates. Thus, investigators studying many species must rely on draft genomes. Using the rhesus macaque as an example, we document the effects of sequencing errors, gaps in sequence and misassemblies on one automated gene model pipeline, Gnomon. The combination of draft genome with automated gene finding software can result in spurious sequences. We estimate that approximately 50% of the rhesus gene models are missing, incomplete or incorrect. The problems identified in this work likely apply to all draft vertebrate genomes annotated with any automated gene model pipeline and thus represent a pervasive challenge to the analysis of draft genomes.
id pubmed-3426473
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
BMC Genomics. Correspondence. BioMed Central, 2012-05-30. Copyright ©2012 Zhang et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.