
Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data

BACKGROUND: The identification of new diagnostic or prognostic biomarkers is one of the main aims of clinical cancer research. Technologies like mass spectrometry are commonly being used in proteomic research. Mass spectrometry signals show the proteomic profiles of the individuals under study at a...


Bibliographic Details
Main authors: Truntzer, Caroline; Mostacci, Elise; Jeannin, Aline; Petit, Jean-Michel; Ducoroy, Patrick; Cardot, Hervé
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4261611/
https://www.ncbi.nlm.nih.gov/pubmed/25432156
http://dx.doi.org/10.1186/s12859-014-0385-z
_version_ 1782348303325724672
author Truntzer, Caroline
Mostacci, Elise
Jeannin, Aline
Petit, Jean-Michel
Ducoroy, Patrick
Cardot, Hervé
author_facet Truntzer, Caroline
Mostacci, Elise
Jeannin, Aline
Petit, Jean-Michel
Ducoroy, Patrick
Cardot, Hervé
author_sort Truntzer, Caroline
collection PubMed
description BACKGROUND: The identification of new diagnostic or prognostic biomarkers is one of the main aims of clinical cancer research. Technologies like mass spectrometry are commonly used in proteomic research. Mass spectrometry signals show the proteomic profiles of the individuals under study at a given time. These profiles record a large number of proteins, much larger than the number of individuals. These variables add to or complete the classical clinical variables. The objective of this study is to evaluate and compare the predictive ability of new and existing models combining mass spectrometry data and classical clinical variables. This study was conducted in the context of binary prediction. RESULTS: Simulated data and a real dataset dedicated to the selection of proteomic markers of steatosis were used to evaluate the methods. The proposed methods meet the challenge of high-dimensional data and the selection of predictive markers by using penalization methods (Ridge, Lasso) and dimension reduction techniques (PLS), as well as a combination of both strategies through sparse PLS in the context of binary class prediction. The methods were compared in terms of mean classification rate and their ability to select the true predictive values. These comparisons were done on clinical-only models, mass-spectrometry-only models and combined models. CONCLUSIONS: Models that combine both types of data can be more efficient than models that use only clinical or mass spectrometry data when the sample size of the dataset is large enough.
format Online
Article
Text
id pubmed-4261611
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4261611 2014-12-10 Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data Truntzer, Caroline Mostacci, Elise Jeannin, Aline Petit, Jean-Michel Ducoroy, Patrick Cardot, Hervé BMC Bioinformatics Research Article
BioMed Central 2014-11-29 /pmc/articles/PMC4261611/ /pubmed/25432156 http://dx.doi.org/10.1186/s12859-014-0385-z Text en © Truntzer et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Truntzer, Caroline
Mostacci, Elise
Jeannin, Aline
Petit, Jean-Michel
Ducoroy, Patrick
Cardot, Hervé
Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data
title Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data
title_full Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data
title_fullStr Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data
title_full_unstemmed Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data
title_short Comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data
title_sort comparison of classification methods that combine clinical data and high-dimensional mass spectrometry data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4261611/
https://www.ncbi.nlm.nih.gov/pubmed/25432156
http://dx.doi.org/10.1186/s12859-014-0385-z
work_keys_str_mv AT truntzercaroline comparisonofclassificationmethodsthatcombineclinicaldataandhighdimensionalmassspectrometrydata
AT mostaccielise comparisonofclassificationmethodsthatcombineclinicaldataandhighdimensionalmassspectrometrydata
AT jeanninaline comparisonofclassificationmethodsthatcombineclinicaldataandhighdimensionalmassspectrometrydata
AT petitjeanmichel comparisonofclassificationmethodsthatcombineclinicaldataandhighdimensionalmassspectrometrydata
AT ducoroypatrick comparisonofclassificationmethodsthatcombineclinicaldataandhighdimensionalmassspectrometrydata
AT cardotherve comparisonofclassificationmethodsthatcombineclinicaldataandhighdimensionalmassspectrometrydata