Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods
In this paper, we performed a comprehensive re-annotation of protein-coding genes by a systematic method combining composition- and similarity-based approaches in 10 complete bacterial genomes of the family Neisseriaceae. First, 418 hypothetical genes were predicted as non-coding using the compositi...
Main Authors: | Guo, Feng-Biao; Xiong, Lifeng; Teng, Jade L. L.; Yuen, Kwok-Yung; Lau, Susanna K. P.; Woo, Patrick C. Y. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2013 |
Subjects: | Full Papers |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3686433/ https://www.ncbi.nlm.nih.gov/pubmed/23571676 http://dx.doi.org/10.1093/dnares/dst009 |
_version_ | 1782273790102732800 |
---|---|
author | Guo, Feng-Biao Xiong, Lifeng Teng, Jade L. L. Yuen, Kwok-Yung Lau, Susanna K. P. Woo, Patrick C. Y. |
author_facet | Guo, Feng-Biao Xiong, Lifeng Teng, Jade L. L. Yuen, Kwok-Yung Lau, Susanna K. P. Woo, Patrick C. Y. |
author_sort | Guo, Feng-Biao |
collection | PubMed |
description | In this paper, we performed a comprehensive re-annotation of protein-coding genes by a systematic method combining composition- and similarity-based approaches in 10 complete bacterial genomes of the family Neisseriaceae. First, 418 hypothetical genes were predicted as non-coding using the composition-based method and 413 were eliminated from the gene list. Both the scatter plot and cluster of orthologous groups (COG) fraction analyses supported the result. Second, from 20 to 400 hypothetical proteins were assigned functions in each of the 10 strains based on the homology search. Among the newly assigned functions, 397 are detailed enough to have definite gene names. Third, 106 genes missed by the original annotations were picked up by an ab initio gene finder combined with similarity alignment. Transcriptional experiments validated the effectiveness of this method in Laribacter hongkongensis and Chromobacterium violaceum. Among the 106 newly found genes, some deserve particular interest. For example, 27 transposases were newly found in Neisseria meningitidis alpha14. In Neisseria gonorrhoeae NCCP11945, four new genes with putative functions and definite names (nusG, rpsN, rpmD and infA) were found, and their homologues are usually essential for survival in bacteria. The updated annotations for the 10 Neisseriaceae genomes provide a more accurate prediction of protein-coding genes and more detailed functional information on hypothetical proteins. They will benefit research into the lifestyle, metabolism, environmental adaptation and pathogenicity of the Neisseriaceae species. The re-annotation procedure could be used directly, or after adaptation of the detailed methods, for checking annotations of any other bacterial or archaeal genomes. |
format | Online Article Text |
id | pubmed-3686433 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3686433 2013-06-19 Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods Guo, Feng-Biao Xiong, Lifeng Teng, Jade L. L. Yuen, Kwok-Yung Lau, Susanna K. P. Woo, Patrick C. Y. DNA Res Full Papers In this paper, we performed a comprehensive re-annotation of protein-coding genes by a systematic method combining composition- and similarity-based approaches in 10 complete bacterial genomes of the family Neisseriaceae. First, 418 hypothetical genes were predicted as non-coding using the composition-based method and 413 were eliminated from the gene list. Both the scatter plot and cluster of orthologous groups (COG) fraction analyses supported the result. Second, from 20 to 400 hypothetical proteins were assigned functions in each of the 10 strains based on the homology search. Among the newly assigned functions, 397 are detailed enough to have definite gene names. Third, 106 genes missed by the original annotations were picked up by an ab initio gene finder combined with similarity alignment. Transcriptional experiments validated the effectiveness of this method in Laribacter hongkongensis and Chromobacterium violaceum. Among the 106 newly found genes, some deserve particular interest. For example, 27 transposases were newly found in Neisseria meningitidis alpha14. In Neisseria gonorrhoeae NCCP11945, four new genes with putative functions and definite names (nusG, rpsN, rpmD and infA) were found, and their homologues are usually essential for survival in bacteria. The updated annotations for the 10 Neisseriaceae genomes provide a more accurate prediction of protein-coding genes and more detailed functional information on hypothetical proteins. They will benefit research into the lifestyle, metabolism, environmental adaptation and pathogenicity of the Neisseriaceae species. The re-annotation procedure could be used directly, or after adaptation of the detailed methods, for checking annotations of any other bacterial or archaeal genomes. Oxford University Press 2013-06 2013-04-09 /pmc/articles/PMC3686433/ /pubmed/23571676 http://dx.doi.org/10.1093/dnares/dst009 Text en © The Author 2013. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com. |
spellingShingle | Full Papers Guo, Feng-Biao Xiong, Lifeng Teng, Jade L. L. Yuen, Kwok-Yung Lau, Susanna K. P. Woo, Patrick C. Y. Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods |
title | Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods |
title_full | Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods |
title_fullStr | Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods |
title_full_unstemmed | Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods |
title_short | Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods |
title_sort | re-annotation of protein-coding genes in 10 complete genomes of neisseriaceae family by combining similarity-based and composition-based methods |
topic | Full Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3686433/ https://www.ncbi.nlm.nih.gov/pubmed/23571676 http://dx.doi.org/10.1093/dnares/dst009 |
work_keys_str_mv | AT guofengbiao reannotationofproteincodinggenesin10completegenomesofneisseriaceaefamilybycombiningsimilaritybasedandcompositionbasedmethods AT xionglifeng reannotationofproteincodinggenesin10completegenomesofneisseriaceaefamilybycombiningsimilaritybasedandcompositionbasedmethods AT tengjadell reannotationofproteincodinggenesin10completegenomesofneisseriaceaefamilybycombiningsimilaritybasedandcompositionbasedmethods AT yuenkwokyung reannotationofproteincodinggenesin10completegenomesofneisseriaceaefamilybycombiningsimilaritybasedandcompositionbasedmethods AT laususannakp reannotationofproteincodinggenesin10completegenomesofneisseriaceaefamilybycombiningsimilaritybasedandcompositionbasedmethods AT woopatrickcy reannotationofproteincodinggenesin10completegenomesofneisseriaceaefamilybycombiningsimilaritybasedandcompositionbasedmethods |