A proteogenomic analysis of Shigella flexneri using 2D LC-MALDI TOF/TOF

BACKGROUND: New strategies for high-throughput sequencing are constantly appearing, leading to a great increase in the number of completely sequenced genomes. Unfortunately, computational genome annotation is out of step with this progress. Thus, the accurate annotation of these genomes has become a...

Bibliographic Details
Main Authors: Zhao, Lina; Liu, Liguo; Leng, Wenchuan; Wei, Candong; Jin, Qi
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3219829/
https://www.ncbi.nlm.nih.gov/pubmed/22032405
http://dx.doi.org/10.1186/1471-2164-12-528
author Zhao, Lina
Liu, Liguo
Leng, Wenchuan
Wei, Candong
Jin, Qi
collection PubMed
description BACKGROUND: New strategies for high-throughput sequencing are constantly appearing, leading to a great increase in the number of completely sequenced genomes. Unfortunately, computational genome annotation is out of step with this progress. Thus, the accurate annotation of these genomes has become a bottleneck of knowledge acquisition. RESULTS: We exploited a proteogenomic approach to improve conventional genome annotation by integrating proteomic data with genomic information. Using Shigella flexneri 2a as a model, we identified a total of 823 proteins, including 187 hypothetical proteins. Among them, three annotated ORFs were extended upstream through comprehensive analysis against an in-house N-terminal extension database. Two genes, which could not be translated to their full length because of stop codon 'mutations' induced by genome sequencing errors, were revised and annotated as fully functional genes. Above all, seven new ORFs were discovered that had not been predicted in S. flexneri 2a str. 301 by any other annotation approach. The transcripts of four novel ORFs were confirmed by RT-PCR assay. Additionally, most of these novel ORFs were overlapping genes, some even nested within the coding region of other known genes. CONCLUSIONS: Our findings demonstrate that current Shigella genome annotation methods are not perfect and need to be improved. Apart from the validation of predicted genes at the protein level, the additional features of proteogenomic tools include revision of annotation errors and discovery of novel ORFs. The complementary dataset could provide more targets for those interested in Shigella to perform functional studies.
format Online
Article
Text
id pubmed-3219829
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3219829 2011-11-18 A proteogenomic analysis of Shigella flexneri using 2D LC-MALDI TOF/TOF Zhao, Lina Liu, Liguo Leng, Wenchuan Wei, Candong Jin, Qi BMC Genomics Research Article
BioMed Central 2011-10-28 /pmc/articles/PMC3219829/ /pubmed/22032405 http://dx.doi.org/10.1186/1471-2164-12-528 Text en Copyright ©2011 Zhao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A proteogenomic analysis of Shigella flexneri using 2D LC-MALDI TOF/TOF
topic Research Article