Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms

BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS) studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset...

Bibliographic Details
Main Authors: Zou, Jinfeng; Hong, Guini; Guo, Xinwu; Zhang, Lin; Yao, Chen; Wang, Jing; Guo, Zheng
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3194809/
https://www.ncbi.nlm.nih.gov/pubmed/22022591
http://dx.doi.org/10.1371/journal.pone.0026294
_version_ 1782214052984913920
author Zou, Jinfeng
Hong, Guini
Guo, Xinwu
Zhang, Lin
Yao, Chen
Wang, Jing
Guo, Zheng
author_facet Zou, Jinfeng
Hong, Guini
Guo, Xinwu
Zhang, Lin
Yao, Chen
Wang, Jing
Guo, Zheng
author_sort Zou, Jinfeng
collection PubMed
description BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS) studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset for biomarker identification depends on a data pre-processing algorithm. Until now, no widely accepted agreement has been reached. RESULTS: In this paper, we investigated the consistency of biomarker identification using differentially expressed (DE) peaks from peak profiles produced by three widely used average spectrum-dependent pre-processing algorithms based on SELDI-TOF MS data for prostate and breast cancers. Our results revealed two important factors that affect the consistency of DE peak identification using different algorithms. One factor is that some DE peaks selected from one peak profile were not detected as peaks in other profiles, and the second factor is that the statistical power of identifying DE peaks in large peak profiles with many peaks may be low due to the large scale of the tests and small number of samples. Furthermore, we demonstrated that the DE peak detection power in large profiles could be improved by the stratified false discovery rate (FDR) control approach and that the reproducibility of DE peak detection could thereby be increased. CONCLUSIONS: Comparing and evaluating pre-processing algorithms in terms of reproducibility can elucidate the relationship among different algorithms and also help in selecting a pre-processing algorithm. The DE peaks selected from small peak profiles with few peaks for a dataset tend to be reproducibly detected in large peak profiles, which suggests that a suitable pre-processing algorithm should be able to produce peaks sufficient for identifying useful and reproducible biomarkers.
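The abstract's point about stratified FDR control can be illustrated with a minimal sketch. It assumes that "stratified" means running a Benjamini-Hochberg step-up procedure separately within each group of peaks rather than once over the pooled profile. The record does not say how the authors defined their strata or which FDR procedure they used, so the function names, the stratum labels, and the p-values below are all hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float) -> List[bool]:
    """Return one reject flag per p-value using the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= k * alpha / m; 0 means nothing is rejected.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def stratified_bh(
    p_values: Sequence[float], strata: Sequence[Hashable], alpha: float
) -> List[bool]:
    """Apply Benjamini-Hochberg separately inside each stratum (assumed scheme)."""
    groups: Dict[Hashable, List[int]] = {}
    for i, label in enumerate(strata):
        groups.setdefault(label, []).append(i)
    rejected = [False] * len(p_values)
    for indices in groups.values():
        flags = benjamini_hochberg([p_values[i] for i in indices], alpha)
        for i, flag in zip(indices, flags):
            rejected[i] = flag
    return rejected


# Hypothetical data: three peaks in a small stratum "a", five in a noisy stratum "b".
example_p = [0.001, 0.012, 0.02, 0.5, 0.7, 0.9, 0.6, 0.8]
example_strata = ["a", "a", "a", "b", "b", "b", "b", "b"]

pooled = benjamini_hochberg(example_p, alpha=0.05)
stratified = stratified_bh(example_p, example_strata, alpha=0.05)
print("pooled rejections:", sum(pooled))
print("stratified rejections:", sum(stratified))
```

In this toy input the pooled procedure has to spread its error budget over all eight tests, while the stratified version tests the three peaks in stratum "a" against a smaller multiplicity, which is the kind of power gain the abstract describes for large peak profiles.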
format Online
Article
Text
id pubmed-3194809
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-31948092011-10-21 Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms Zou, Jinfeng Hong, Guini Guo, Xinwu Zhang, Lin Yao, Chen Wang, Jing Guo, Zheng PLoS One Research Article BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS) studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset for biomarker identification depends on a data pre-processing algorithm. Until now, no widely accepted agreement has been reached. RESULTS: In this paper, we investigated the consistency of biomarker identification using differentially expressed (DE) peaks from peak profiles produced by three widely used average spectrum-dependent pre-processing algorithms based on SELDI-TOF MS data for prostate and breast cancers. Our results revealed two important factors that affect the consistency of DE peak identification using different algorithms. One factor is that some DE peaks selected from one peak profile were not detected as peaks in other profiles, and the second factor is that the statistical power of identifying DE peaks in large peak profiles with many peaks may be low due to the large scale of the tests and small number of samples. Furthermore, we demonstrated that the DE peak detection power in large profiles could be improved by the stratified false discovery rate (FDR) control approach and that the reproducibility of DE peak detection could thereby be increased. CONCLUSIONS: Comparing and evaluating pre-processing algorithms in terms of reproducibility can elucidate the relationship among different algorithms and also help in selecting a pre-processing algorithm. 
The DE peaks selected from small peak profiles with few peaks for a dataset tend to be reproducibly detected in large peak profiles, which suggests that a suitable pre-processing algorithm should be able to produce peaks sufficient for identifying useful and reproducible biomarkers. Public Library of Science 2011-10-14 /pmc/articles/PMC3194809/ /pubmed/22022591 http://dx.doi.org/10.1371/journal.pone.0026294 Text en Zou et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Zou, Jinfeng
Hong, Guini
Guo, Xinwu
Zhang, Lin
Yao, Chen
Wang, Jing
Guo, Zheng
Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms
title Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms
title_full Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms
title_fullStr Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms
title_full_unstemmed Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms
title_short Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms
title_sort reproducible cancer biomarker discovery in seldi-tof ms using different pre-processing algorithms
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3194809/
https://www.ncbi.nlm.nih.gov/pubmed/22022591
http://dx.doi.org/10.1371/journal.pone.0026294
work_keys_str_mv AT zoujinfeng reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms
AT hongguini reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms
AT guoxinwu reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms
AT zhanglin reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms
AT yaochen reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms
AT wangjing reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms
AT guozheng reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms