A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection

BACKGROUND: Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high dimensional data. A large number of local maxima (a.k.a. peaks) have to be analyzed as part of computational pipelines aimed at the realiz...

Bibliographic Details
Main Authors: Ceccarelli, Michele, d'Acierno, Antonio, Facchiano, Angelo
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2762074/
https://www.ncbi.nlm.nih.gov/pubmed/19828085
http://dx.doi.org/10.1186/1471-2105-10-S12-S9
_version_ 1782172891829239808
author Ceccarelli, Michele
d'Acierno, Antonio
Facchiano, Angelo
collection PubMed
description BACKGROUND: Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high-dimensional data. A large number of local maxima (a.k.a. peaks) have to be analyzed as part of computational pipelines aimed at the realization of efficient predictive and screening protocols. With data of this dimensionality and sample size, the risk of over-fitting and selection bias is pervasive. Therefore, the development of bioinformatics methods based on unsupervised feature extraction can lead to general tools that can be applied to several fields of predictive proteomics. RESULTS: We propose a method for feature selection and extraction, grounded in the theory of multi-scale spaces, for high-resolution spectra derived from the analysis of serum. We then use support vector machines for classification. In particular, we use a database containing 216 sample spectra, divided into 115 cancer and 91 control samples. The overall accuracy averaged over a large cross-validation study is 98.18. The area under the ROC curve of the best selected model is 0.9962. CONCLUSION: We improved on previously known results for the same problem on the same data, with the advantage that the proposed method has an unsupervised feature selection phase. All the developed code, as MATLAB scripts, can be downloaded from
format Text
id pubmed-2762074
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2762074 2009-10-15 BMC Bioinformatics Research BioMed Central 2009-10-15 /pmc/articles/PMC2762074/ /pubmed/19828085 http://dx.doi.org/10.1186/1471-2105-10-S12-S9 Text en Copyright © 2009 Ceccarelli et al; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection
topic Research