Correction of the Caulobacter crescentus NA1000 Genome Annotation
Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To impr...
Main authors: | Ely, Bert; Scott, LaTia Etheredge |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3951458/ https://www.ncbi.nlm.nih.gov/pubmed/24621776 http://dx.doi.org/10.1371/journal.pone.0091668 |
_version_ | 1782307124694482944 |
---|---|
author | Ely, Bert Scott, LaTia Etheredge |
author_facet | Ely, Bert Scott, LaTia Etheredge |
author_sort | Ely, Bert |
collection | PubMed |
description | Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome utilizing computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content to the location of protein coding regions could be used to verify the annotation of any genome that has a GC content that is greater than 60%. |
format | Online Article Text |
id | pubmed-3951458 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3951458 2014-03-13 Correction of the Caulobacter crescentus NA1000 Genome Annotation Ely, Bert Scott, LaTia Etheredge PLoS One Research Article Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome utilizing computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content to the location of protein coding regions could be used to verify the annotation of any genome that has a GC content that is greater than 60%.
Public Library of Science 2014-03-12 /pmc/articles/PMC3951458/ /pubmed/24621776 http://dx.doi.org/10.1371/journal.pone.0091668 Text en © 2014 Ely, Scott http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Ely, Bert Scott, LaTia Etheredge Correction of the Caulobacter crescentus NA1000 Genome Annotation |
title | Correction of the Caulobacter crescentus NA1000 Genome Annotation |
title_full | Correction of the Caulobacter crescentus NA1000 Genome Annotation |
title_fullStr | Correction of the Caulobacter crescentus NA1000 Genome Annotation |
title_full_unstemmed | Correction of the Caulobacter crescentus NA1000 Genome Annotation |
title_short | Correction of the Caulobacter crescentus NA1000 Genome Annotation |
title_sort | correction of the caulobacter crescentus na1000 genome annotation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3951458/ https://www.ncbi.nlm.nih.gov/pubmed/24621776 http://dx.doi.org/10.1371/journal.pone.0091668 |
work_keys_str_mv | AT elybert correctionofthecaulobactercrescentusna1000genomeannotation AT scottlatiaetheredge correctionofthecaulobactercrescentusna1000genomeannotation |
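
The description field in this record mentions comparing peaks in third codon position GC content with annotated coding regions. As a rough illustration of that idea only, the Python sketch below computes the GC fraction at third codon positions for each of the three forward reading frames. It is not the Artemis or MICheck implementation; the function names `gc3_by_frame` and `best_frame`, the non-overlapping window scheme, and the default window of 40 codons are all assumptions made for this example, and the reverse strand is ignored.

```python
def gc3_by_frame(seq, window_codons=40):
    """Return {frame: [GC fraction at third codon positions per window]}.

    Hypothetical helper: forward strand only, non-overlapping windows of
    `window_codons` codons (an arbitrary choice for illustration).
    """
    seq = seq.upper()
    profile = {}
    for frame in range(3):
        # Third codon positions in this frame are frame+2, frame+5, ...
        thirds = seq[frame + 2::3]
        fractions = []
        for start in range(0, len(thirds) - window_codons + 1, window_codons):
            window = thirds[start:start + window_codons]
            gc_count = sum(base in "GC" for base in window)
            fractions.append(gc_count / window_codons)
        profile[frame] = fractions
    return profile


def best_frame(seq, window_codons=40):
    """Return the forward frame (0, 1, or 2) with the highest mean GC3."""
    profile = gc3_by_frame(seq, window_codons)
    means = {f: sum(v) / len(v) for f, v in profile.items() if v}
    return max(means, key=means.get)


if __name__ == "__main__":
    toy = "ATG" * 50  # toy sequence: every codon ends in G in frame 0
    print(gc3_by_frame(toy))
    print(best_frame(toy))
```

In a GC-rich genome, the frame whose third-position GC content peaks over a region would be expected to match the annotated reading frame there; a mismatch would flag the annotation for manual review rather than prove it wrong.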