Biomarker signature identification in “omics” data with multi-class outcome

Bibliographic Details
Main Authors: Lagani, Vincenzo; Kortas, George; Tsamardinos, Ioannis
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology (RNCSB) Organization 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3962136/
https://www.ncbi.nlm.nih.gov/pubmed/24688712
http://dx.doi.org/10.5936/csbj.201303004
_version_ 1782308389044355072
author Lagani, Vincenzo
Kortas, George
Tsamardinos, Ioannis
collection PubMed
description Biomarker signature identification in “omics” data is a complex challenge that requires specialized feature selection algorithms. The objective of these algorithms is to select the smallest set(s) of molecular quantities that are able to predict a given outcome (target) with maximal predictive performance. This task is even more challenging when the outcome comprises multiple classes; for example, one may be interested in identifying the genes whose expression allows discrimination among different types of cancer (nominal outcome) or among different stages of the same cancer, e.g., Stages 1, 2, 3 and 4 of lung adenocarcinoma (ordinal outcome). In this work, we consider a particular type of successful feature selection method, namely constraint-based, local causal discovery algorithms. These algorithms rely on a series of conditional independence tests. We extend them to problems with continuous predictors and multi-class outcomes by developing and equipping them with an appropriate conditional independence test procedure for both nominal and ordinal multi-class targets. The test is based on multinomial logistic regression and employs the log-likelihood ratio test for model selection. We present a comparative experimental evaluation on seven real-world, high-dimensional gene-expression datasets. Within the scope of our analysis, the results indicate that the new conditional independence test allows the identification of smaller and better-performing signatures for multi-class outcome datasets than the current alternatives for performing the independence tests.
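The description names the ingredients of the test (multinomial logistic regression and a log-likelihood ratio test) without spelling them out. The following is a minimal pure-Python sketch of that idea for a nominal target only; it is not the authors' implementation, and the ordinal case is not covered. The function names (`fit_loglik`, `chi2_sf`, `ci_test`), the gradient-ascent fitting with a fixed number of steps, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(weights, x):
    """Class probabilities; class 0 is the reference class with score 0."""
    return softmax([0.0] + [sum(w * v for w, v in zip(row, x)) for row in weights])


def fit_loglik(rows, labels, n_classes, steps=800, lr=0.5):
    """Fit a multinomial logistic model by plain gradient ascent (fixed number
    of steps, so only an approximation of the MLE) and return its log-likelihood."""
    data = [[1.0] + list(r) for r in rows]  # prepend an intercept term
    n, d = len(data), len(data[0])
    weights = [[0.0] * d for _ in range(n_classes - 1)]
    for _ in range(steps):
        grad = [[0.0] * d for _ in weights]
        for x, y in zip(data, labels):
            p = class_probs(weights, x)
            for c in range(1, n_classes):
                err = (1.0 if y == c else 0.0) - p[c]
                for j in range(d):
                    grad[c - 1][j] += err * x[j]
        for c in range(n_classes - 1):
            for j in range(d):
                weights[c][j] += lr * grad[c][j] / n
    return sum(math.log(class_probs(weights, x)[y]) for x, y in zip(data, labels))


def chi2_sf(stat, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))  # exact tail for df = 1
    k = 1
    while k < df:  # step from df = k to df = k + 2
        total += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return total


def ci_test(x, y, z_rows, n_classes):
    """p-value for H0: x is independent of y given the variables in z_rows."""
    reduced = fit_loglik(z_rows, y, n_classes)                      # y ~ z
    full = fit_loglik([list(z) + [xi] for z, xi in zip(z_rows, x)], y, n_classes)  # y ~ z + x
    stat = max(0.0, 2.0 * (full - reduced))
    return chi2_sf(stat, n_classes - 1)  # one tested predictor adds (classes - 1) parameters


# Synthetic illustration (hypothetical data, not from the paper).
random.seed(0)
n = 120
z = [[random.gauss(0, 1)] for _ in range(n)]
x_dep = [random.gauss(0, 1) for _ in range(n)]    # truly associated with y
x_noise = [random.gauss(0, 1) for _ in range(n)]  # unrelated to y
y = []
for zi, xi in zip(z, x_dep):
    score = xi + 0.5 * zi[0] + random.gauss(0, 0.5)
    y.append(0 if score < -0.5 else 1 if score < 0.5 else 2)

p_dep = ci_test(x_dep, y, z, 3)
p_noise = ci_test(x_noise, y, z, 3)
print("p-value, dependent predictor:", p_dep)
print("p-value, noise predictor:", p_noise)
```

In a constraint-based feature selection loop, a predictor would be kept as a candidate only while such a test rejects independence for every conditioning set tried; the significance threshold and the search over conditioning sets are left out of this sketch.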
format Online
Article
Text
id pubmed-3962136
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Research Network of Computational and Structural Biotechnology (RNCSB) Organization
record_format MEDLINE/PubMed
spelling pubmed-3962136 2014-03-31 Biomarker signature identification in “omics” data with multi-class outcome Lagani, Vincenzo Kortas, George Tsamardinos, Ioannis Comput Struct Biotechnol J Research Article
Research Network of Computational and Structural Biotechnology (RNCSB) Organization 2013-06-08 /pmc/articles/PMC3962136/ /pubmed/24688712 http://dx.doi.org/10.5936/csbj.201303004 Text en © Lagani et al. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly cited.
title Biomarker signature identification in “omics” data with multi-class outcome
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3962136/
https://www.ncbi.nlm.nih.gov/pubmed/24688712
http://dx.doi.org/10.5936/csbj.201303004