Mass spectrometry data processing using zero-crossing lines in multi-scale of Gaussian derivative wavelet

Motivation: Peaks are the key information in mass spectrometry (MS), which has been increasingly used to discover disease-related proteomic patterns. Peak detection is an essential step in MS-based proteomic data analysis. Recently, several peak detection algorithms have been proposed. However, in...

Bibliographic Details
Main authors: Nguyen, Nha; Huang, Heng; Oraintara, Soontorn; Vo, An
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects: Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935426/
https://www.ncbi.nlm.nih.gov/pubmed/20823336
http://dx.doi.org/10.1093/bioinformatics/btq397
author Nguyen, Nha
Huang, Heng
Oraintara, Soontorn
Vo, An
collection PubMed
description Motivation: Peaks are the key information in mass spectrometry (MS), which has been increasingly used to discover disease-related proteomic patterns. Peak detection is an essential step in MS-based proteomic data analysis. Recently, several peak detection algorithms have been proposed. However, these algorithms have three major deficiencies: (i) because the noise is often removed, the true signal could also be removed; (ii) the baseline removal step may get rid of true peaks and create new false peaks; (iii) in the peak quantification step, a signal-to-noise ratio (SNR) threshold is usually used to remove false peaks; however, noise estimates in the SNR calculation are often inaccurate in either the time or the wavelet domain. In this article, we propose new algorithms to solve these problems. First, we use a bivariate shrinkage estimator in the stationary wavelet domain to avoid removing true peaks in the denoising step. Second, without baseline removal, zero-crossing lines in multiple scales of derivative Gaussian wavelets are investigated with a mixture of Gaussians to estimate discriminative parameters of peaks. Third, in the quantification step, the frequency, SD, height and rank of peaks are used to detect both high and small energy peaks with robustness to noise.
Results: We propose a novel Gaussian Derivative Wavelet (GDWavelet) method to detect true peaks more accurately and with a lower false discovery rate than existing methods. The proposed GDWavelet method has been applied to a real Surface-Enhanced Laser Desorption/Ionization Time-Of-Flight (SELDI-TOF) spectrum with known polypeptide positions and to two synthetic datasets with Gaussian and real noise. All experimental results demonstrate that our method outperforms other commonly used methods. Standard receiver operating characteristic (ROC) curves are used to evaluate the experimental results.
Availability: http://ranger.uta.edu/~heng/MS/GDWavelet.html or http://www.naaan.org/nhanguyen/archive.htm
Contact: heng@uta.edu
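Illustrative note: the abstract's second step relies on zero-crossings of Gaussian derivative responses at several scales, computed without baseline removal. The sketch below shows only that general idea and is not the authors' GDWavelet implementation. It omits the bivariate shrinkage denoising, the mixture-of-Gaussians parameter estimation and the peak quantification. The function names, the scales and the tolerance are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_derivative_kernel(sigma):
    """First derivative of a unit-area Gaussian, sampled on +/- 4 sigma."""
    half = int(np.ceil(4 * sigma))
    t = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()
    return -t / sigma**2 * g


def zero_crossings(signal, sigma):
    """Indices where the smoothed derivative changes from positive to non-positive."""
    kernel = gaussian_derivative_kernel(sigma)
    half = len(kernel) // 2
    # Repeat the edge samples so the borders do not create artificial slopes.
    padded = np.pad(signal, half, mode="edge")
    # Convolving with the derivative kernel differentiates the smoothed signal.
    d = np.convolve(padded, kernel, mode="valid")
    return np.where((d[:-1] > 0) & (d[1:] <= 0))[0]


def persistent_peaks(signal, sigmas=(2.0, 4.0, 8.0), tol=3):
    """Keep coarsest-scale crossings that reappear within tol samples at every finer scale."""
    per_scale = [zero_crossings(signal, s) for s in sorted(sigmas)]
    finer, coarse = per_scale[:-1], per_scale[-1]
    keep = []
    for pos in coarse:
        if all(c.size > 0 and np.min(np.abs(c - pos)) <= tol for c in finer):
            keep.append(int(pos))
    return keep


# Hypothetical synthetic spectrum: two Gaussian peaks on a rising linear baseline.
x = np.arange(1000, dtype=float)
spectrum = (
    1.0 * np.exp(-(x - 300) ** 2 / (2 * 10.0**2))
    + 0.5 * np.exp(-(x - 700) ** 2 / (2 * 10.0**2))
    + 0.0005 * x
)
peaks = persistent_peaks(spectrum)
print(peaks)
```

The baseline is left in place on purpose: a slowly varying baseline adds only a small offset to the derivative response, so the peak crossings shift slightly but remain detectable.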
format Text
id pubmed-2935426
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2935426 2010-09-08 Bioinformatics Oxford University Press 2010-09-15 2010-09-04 /pmc/articles/PMC2935426/ /pubmed/20823336 http://dx.doi.org/10.1093/bioinformatics/btq397 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Mass spectrometry data processing using zero-crossing lines in multi-scale of Gaussian derivative wavelet
topic Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935426/
https://www.ncbi.nlm.nih.gov/pubmed/20823336
http://dx.doi.org/10.1093/bioinformatics/btq397