A data review and re-assessment of ovarian cancer serum proteomic profiling

BACKGROUND: The early detection of ovarian cancer has the potential to dramatically reduce mortality. Recently, the use of mass spectrometry to develop profiles of patient serum proteins, combined with advanced data mining algorithms, has been reported as a promising method to achieve this goal. In t...

Bibliographic Details
Main authors: Sorace, James M; Zhan, Min
Format: Text
Language: English
Published: BioMed Central 2003
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC165662/
https://www.ncbi.nlm.nih.gov/pubmed/12795817
http://dx.doi.org/10.1186/1471-2105-4-24
_version_ 1782120845659865088
author Sorace, James M
Zhan, Min
collection PubMed
description BACKGROUND: The early detection of ovarian cancer has the potential to dramatically reduce mortality. Recently, the use of mass spectrometry to develop profiles of patient serum proteins, combined with advanced data mining algorithms, has been reported as a promising method to achieve this goal. In this report, we analyze the Ovarian Dataset 8-7-02 downloaded from the Clinical Proteomics Program Databank website, using nonparametric statistics and stepwise discriminant analysis to develop rules to diagnose patients, as well as to understand general patterns in the data that may guide future research. RESULTS: The mass spectrometry serum profiles derived from cancer and controls exhibited numerous statistical differences. For example, using the Wilcoxon test to compare the intensity at each of the 15,154 mass-to-charge (M/Z) values between the cancer and control groups detected 3,591 M/Z values whose intensities differed with a p-value of 10^-6 or less. The region containing the M/Z values of greatest statistical difference between cancer and controls occurred at M/Z values less than 500. For example, the M/Z values of 2.7921478 and 245.53704 could be used to significantly separate the cancer from control groups. Three other sets of M/Z values were developed using a training set that could distinguish between cancer and control subjects in a test set with 100% sensitivity and specificity. CONCLUSION: The ability to discriminate between cancer and control subjects based on the M/Z values of 2.7921478 and 245.53704 reveals the existence of a significant non-biologic experimental bias between these two groups. This bias may invalidate attempts to use this dataset to find patterns of reproducible diagnostic value. To minimize false discovery, results using mass spectrometry and data mining algorithms should be carefully reviewed and benchmarked with routine statistical methods.
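The per-M/Z screening described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the record does not say which software or which variant of the Wilcoxon test was used. The sketch assumes a two-sided Wilcoxon rank-sum test with a normal approximation (tie-corrected, no continuity correction) and runs on synthetic Gaussian intensities, not on the Ovarian Dataset 8-7-02. The names `rank_sum_p`, `make_group`, `N_FEATURES`, `N_PER_GROUP`, `N_SHIFTED`, and `THRESHOLD` are hypothetical.

```python
import math
import random


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value using the normal approximation.

    Ties receive mid-ranks and the variance is tie-corrected.
    No continuity correction is applied (an assumption of this sketch).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i                     # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Synthetic stand-in for the spectra: rows are samples, columns are M/Z bins.
random.seed(0)
N_FEATURES, N_PER_GROUP, N_SHIFTED = 200, 50, 10


def make_group(shift):
    """Build one group; the first N_SHIFTED features are offset by `shift`."""
    return [
        [random.gauss(shift if f < N_SHIFTED else 0.0, 1.0) for f in range(N_FEATURES)]
        for _ in range(N_PER_GROUP)
    ]


cancer = make_group(3.0)
control = make_group(0.0)

THRESHOLD = 1e-6  # same cutoff the abstract reports
hits = []
for f in range(N_FEATURES):
    p = rank_sum_p([row[f] for row in cancer], [row[f] for row in control])
    if p <= THRESHOLD:
        hits.append(f)

print(f"{len(hits)} of {N_FEATURES} features differ at p <= {THRESHOLD:g}")
```

With 15,154 tests, a Bonferroni-adjusted 0.05 level is roughly 3.3 × 10^-6, so the 10^-6 cutoff is stricter than that correction. A difference that passes the cutoff is statistically real, but the test alone does not show whether it is biological or an experimental artifact, which is the point the abstract's conclusion makes.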
format Text
id pubmed-165662
institution National Center for Biotechnology Information
language English
publishDate 2003
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-165662 2003-07-20 A data review and re-assessment of ovarian cancer serum proteomic profiling Sorace, James M Zhan, Min BMC Bioinformatics Research Article BioMed Central 2003-06-09 /pmc/articles/PMC165662/ /pubmed/12795817 http://dx.doi.org/10.1186/1471-2105-4-24 Text en Copyright © 2003 Sorace and Zhan; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title A data review and re-assessment of ovarian cancer serum proteomic profiling
topic Research Article