High accuracy mass spectrometry analysis as a tool to verify and improve gene annotation using Mycobacterium tuberculosis as an example

BACKGROUND: While the genomic annotations of diverse lineages of the Mycobacterium tuberculosis complex are available, divergences between gene prediction methods are still a challenge for unbiased protein dataset generation. M. tuberculosis gene annotation is an example, where the most used dataset...

Bibliographic Details
Main Authors: de Souza, Gustavo A; Målen, Hiwa; Søfteland, Tina; Sælensminde, Gisle; Prasad, Swati; Jonassen, Inge; Wiker, Harald G
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2483986/
https://www.ncbi.nlm.nih.gov/pubmed/18597682
http://dx.doi.org/10.1186/1471-2164-9-316
author de Souza, Gustavo A
Målen, Hiwa
Søfteland, Tina
Sælensminde, Gisle
Prasad, Swati
Jonassen, Inge
Wiker, Harald G
collection PubMed
description BACKGROUND: While the genomic annotations of diverse lineages of the Mycobacterium tuberculosis complex are available, divergences between gene prediction methods are still a challenge for unbiased protein dataset generation. M. tuberculosis gene annotation is an example, where the most used datasets from two independent institutions (Sanger Institute and Institute of Genomic Research-TIGR) differ up to 12% in the number of annotated open reading frames, and 46% of the genes contained in both annotations have different start codons. Such differences emphasize the importance of the identification of the sequence of protein products to validate each gene annotation including its sequence coding area. RESULTS: With this objective, we submitted a culture filtrate sample from M. tuberculosis to a high-accuracy LTQ-Orbitrap mass spectrometer analysis and applied refined N-terminal prediction to perform comparison of two gene annotations. From a total of 449 proteins identified from the MS data, we validated 35 tryptic peptides that were specific to one of the two datasets, representing 24 different proteins. From those, 5 proteins were only annotated in the Sanger database. In the remaining proteins, the observed differences were due to differences in annotation of transcriptional start sites. CONCLUSION: Our results indicate that, even in a less complex sample likely to represent only 10% of the bacterial proteome, we were still able to detect major differences between different gene annotation approaches. This gives hope that high-throughput proteomics techniques can be used to improve and validate gene annotations, and in particular for verification of high-throughput, automatic gene annotations.
format Text
id pubmed-2483986
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2483986 2008-07-26 BMC Genomics Research Article BioMed Central 2008-07-02 /pmc/articles/PMC2483986/ /pubmed/18597682 http://dx.doi.org/10.1186/1471-2164-9-316 Text en Copyright © 2008 de Souza et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title High accuracy mass spectrometry analysis as a tool to verify and improve gene annotation using Mycobacterium tuberculosis as an example
topic Research Article