Tandem mass spectrometry data quality assessment by self-convolution

BACKGROUND: Many algorithms have been developed for interpreting tandem mass spectrometry (MS) data sets. They fall essentially into two classes: the first searches a database of theoretical mass spectra, while the second relies on de novo sequencing from raw mass spectr...

Bibliographic Details
Main authors:	Choo, Keng Wah; Tham, Wai Mun
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2164967/ https://www.ncbi.nlm.nih.gov/pubmed/17880728 http://dx.doi.org/10.1186/1471-2105-8-352

author	Choo, Keng Wah Tham, Wai Mun
collection	PubMed
description	BACKGROUND: Many algorithms have been developed for interpreting tandem mass spectrometry (MS) data sets. They fall essentially into two classes: the first searches a database of theoretical mass spectra, while the second relies on de novo sequencing from raw mass spectrometry data. The quality of the mass spectra significantly affects protein identification in both cases. This prompted the authors to explore ways of measuring the quality of MS data sets before subjecting them to protein identification algorithms, allowing for more meaningful searches and greater confidence in the proteins identified. RESULTS: The proposed method measures the quality of MS data sets based on the symmetric property of the b- and y-ion peaks present in an MS spectrum. Self-convolution of the MS data with its time-reversed copy was employed. Owing to the symmetric nature of b-ion and y-ion peaks, the self-convolution of a good spectrum produces its highest-intensity peak at the mid point. To reduce processing time, the self-convolution was computed using the Fast Fourier Transform and its inverse, followed by removal of the "DC" (direct current) component and normalisation of the data set. The quality score was defined as the ratio of the intensity at the mid point to the remaining peaks of the convolution result. The method was validated using both theoretical mass spectra, with various permutations, and several real MS data sets. The results were encouraging, revealing a high percentage of positive prediction rates for spectra with good quality scores. CONCLUSION: We have demonstrated in this work a method for determining the quality of tandem MS data sets. By determining the quality of tandem MS data before subjecting them to protein identification algorithms, spurious protein predictions due to poor tandem MS data can be avoided, giving scientists greater confidence in the predicted results. We conclude that the algorithm performs well and could potentially be used as a pre-processing step for all mass spectrometry-based protein identification tools.
format	Text
id	pubmed-2164967
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2164967 2008-01-02 Tandem mass spectrometry data quality assessment by self-convolution Choo, Keng Wah Tham, Wai Mun BMC Bioinformatics Methodology Article BioMed Central 2007-09-20 /pmc/articles/PMC2164967/ /pubmed/17880728 http://dx.doi.org/10.1186/1471-2105-8-352 Text en Copyright © 2007 Choo and Tham; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Tandem mass spectrometry data quality assessment by self-convolution
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2164967/ https://www.ncbi.nlm.nih.gov/pubmed/17880728 http://dx.doi.org/10.1186/1471-2105-8-352
work_keys_str_mv	AT chookengwah tandemmassspectrometrydataqualityassessmentbyselfconvolution AT thamwaimun tandemmassspectrometrydataqualityassessmentbyselfconvolution
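
The scoring steps summarised in the description (self-convolution via the Fast Fourier Transform, removal of the "DC" component, normalisation, and a mid-point ratio) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. Several points are assumptions: the spectrum is taken to be already binned so that complementary b- and y-ion peaks fall in mirror-image bins (`i` and `N-1-i`); the "DC" component is removed by zeroing the zero-frequency term of the squared transform; and "the remaining peaks" is read as the largest absolute value away from the mid point. The function names `fft`, `self_convolution` and `quality_score` are hypothetical.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def self_convolution(spectrum):
    """Linear self-convolution of a binned spectrum, DC-removed and normalised."""
    n = len(spectrum)
    size = 1
    while size < 2 * n - 1:  # zero-pad so circular convolution equals linear
        size *= 2
    padded = [complex(x) for x in spectrum] + [0j] * (size - n)
    freq = fft(padded)
    product = [f * f for f in freq]  # convolution theorem: multiply spectra
    product[0] = 0j  # assumption: "DC" removal = zero the 0-frequency term
    conv = [v.real / size for v in fft(product, inverse=True)]
    conv = conv[: 2 * n - 1]
    peak = max(abs(v) for v in conv)
    if peak == 0:
        return conv  # all-zero input: nothing to normalise
    return [v / peak for v in conv]


def quality_score(spectrum):
    """Mid-point intensity divided by the largest remaining magnitude.

    Assumption: the abstract's "remaining peaks" is read as the maximum
    absolute value of the convolution result away from the mid point.
    """
    if len(spectrum) < 2:
        raise ValueError("need at least two bins")
    conv = self_convolution(spectrum)
    mid = len(spectrum) - 1  # centre of the 2N-1 point result
    rest = max(abs(v) for i, v in enumerate(conv) if i != mid)
    if rest == 0:
        return 0.0
    return conv[mid] / rest


if __name__ == "__main__":
    print(quality_score([1.0, 0.0, 0.0, 1.0]))  # mirror-symmetric toy spectrum
    print(quality_score([1.0, 1.0, 0.0, 0.0]))  # asymmetric toy spectrum
```

Under these assumptions the mirror-symmetric toy spectrum scores about 3.0 and the asymmetric one about -0.33, so a score above 1 indicates that the mid point is the dominant peak. In practice a library FFT would replace the small recursive one used here to keep the sketch free of dependencies.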
