Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms
BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS) studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset...
Main authors: | Zou, Jinfeng; Hong, Guini; Guo, Xinwu; Zhang, Lin; Yao, Chen; Wang, Jing; Guo, Zheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3194809/ https://www.ncbi.nlm.nih.gov/pubmed/22022591 http://dx.doi.org/10.1371/journal.pone.0026294 |
_version_ | 1782214052984913920 |
---|---|
author | Zou, Jinfeng Hong, Guini Guo, Xinwu Zhang, Lin Yao, Chen Wang, Jing Guo, Zheng |
author_sort | Zou, Jinfeng |
collection | PubMed |
description | BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS) studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset for biomarker identification depends on a data pre-processing algorithm. Until now, no widely accepted agreement has been reached. RESULTS: In this paper, we investigated the consistency of biomarker identification using differentially expressed (DE) peaks from peak profiles produced by three widely used average spectrum-dependent pre-processing algorithms based on SELDI-TOF MS data for prostate and breast cancers. Our results revealed two important factors that affect the consistency of DE peak identification using different algorithms. One factor is that some DE peaks selected from one peak profile were not detected as peaks in other profiles, and the second factor is that the statistical power of identifying DE peaks in large peak profiles with many peaks may be low due to the large scale of the tests and small number of samples. Furthermore, we demonstrated that the DE peak detection power in large profiles could be improved by the stratified false discovery rate (FDR) control approach and that the reproducibility of DE peak detection could thereby be increased. CONCLUSIONS: Comparing and evaluating pre-processing algorithms in terms of reproducibility can elucidate the relationship among different algorithms and also help in selecting a pre-processing algorithm. The DE peaks selected from small peak profiles with few peaks for a dataset tend to be reproducibly detected in large peak profiles, which suggests that a suitable pre-processing algorithm should be able to produce peaks sufficient for identifying useful and reproducible biomarkers. |
format | Online Article Text |
id | pubmed-3194809 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3194809 2011-10-21 Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms Zou, Jinfeng Hong, Guini Guo, Xinwu Zhang, Lin Yao, Chen Wang, Jing Guo, Zheng PLoS One Research Article Public Library of Science 2011-10-14 /pmc/articles/PMC3194809/ /pubmed/22022591 http://dx.doi.org/10.1371/journal.pone.0026294 Text en Zou et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Zou, Jinfeng Hong, Guini Guo, Xinwu Zhang, Lin Yao, Chen Wang, Jing Guo, Zheng Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms |
title | Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3194809/ https://www.ncbi.nlm.nih.gov/pubmed/22022591 http://dx.doi.org/10.1371/journal.pone.0026294 |
work_keys_str_mv | AT zoujinfeng reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms AT hongguini reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms AT guoxinwu reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms AT zhanglin reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms AT yaochen reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms AT wangjing reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms AT guozheng reproduciblecancerbiomarkerdiscoveryinselditofmsusingdifferentpreprocessingalgorithms |
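
The description above reports that stratified false discovery rate (FDR) control improved the power to detect DE peaks in large peak profiles. This record does not say which FDR procedure or which stratification the authors used, so the following is only a minimal sketch under stated assumptions: it applies the standard Benjamini-Hochberg step-up procedure separately within each stratum, and the p-values and the stratum labels `A` and `B` are hypothetical illustration data, not results from the paper.

```python
from collections import defaultdict


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def stratified_bh(pvalues, strata, alpha=0.05):
    """Apply Benjamini-Hochberg independently inside each stratum.

    `strata` holds one label per p-value. How peaks are assigned to
    strata is an assumption of this sketch, not taken from the record.
    """
    groups = defaultdict(list)
    for i, label in enumerate(strata):
        groups[label].append(i)
    rejected = [False] * len(pvalues)
    for indices in groups.values():
        flags = benjamini_hochberg([pvalues[i] for i in indices], alpha)
        for i, flag in zip(indices, flags):
            rejected[i] = flag
    return rejected


# Hypothetical p-values: three peaks in stratum "A", seven in stratum "B".
pvalues = [0.001, 0.012, 0.02, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
strata = ["A"] * 3 + ["B"] * 7

pooled = benjamini_hochberg(pvalues)
stratified = stratified_bh(pvalues, strata)
print("pooled rejections:    ", sum(pooled))
print("stratified rejections:", sum(stratified))
```

With this toy input, pooling all ten tests rejects only the smallest p-value, whereas testing stratum `A` on its own rejects all three of its peaks. This is the kind of power gain the description attributes to stratification when many uninformative peaks inflate the number of tests.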