Computing Molecular Signatures as Optima of a Bi-Objective Function: Method and Application to Prediction in Oncogenomics
BACKGROUND: Filter feature selection methods compute molecular signatures by selecting subsets of genes according to the ranking induced by a valuation function. The motivations for the choice of valuation function are almost always clearly stated, but those for selecting the genes according to their ranking are hardly...
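Main authors: | Gardeux, Vincent; Chelouah, Rachid; Wanderley, Maria F Barbosa; Siarry, Patrick; Braga, Antônio P; Reyal, Fabien; Rouzier, Roman; Pusztai, Lajos; Natowicz, René |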
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Libertas Academica 2015 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4426938/ https://www.ncbi.nlm.nih.gov/pubmed/25983540 http://dx.doi.org/10.4137/CIN.S21111 |
_version_ | 1782370655082119168 |
---|---|
author | Gardeux, Vincent Chelouah, Rachid Wanderley, Maria F Barbosa Siarry, Patrick Braga, Antônio P Reyal, Fabien Rouzier, Roman Pusztai, Lajos Natowicz, René |
author_sort | Gardeux, Vincent |
collection | PubMed |
description | BACKGROUND: Filter feature selection methods compute molecular signatures by selecting subsets of genes according to the ranking induced by a valuation function. The motivations for the choice of valuation function are almost always clearly stated, but those for selecting the genes according to their ranking are hardly ever explicit. METHOD: We addressed the computation of molecular signatures by searching for the optima of a bi-objective function whose solution space was the set of all possible molecular signatures, ie, the set of subsets of genes. The two objectives were the size of the signature (to be minimized) and the interclass distance induced by the signature (to be maximized). RESULTS: We showed that: 1) the convex combination of the two objectives had exactly n optimal nonempty signatures, where n was the number of genes, 2) the n optimal signatures were nested, and 3) the optimal signature of size k was the subset of the k top-ranked genes, ie, those that contributed the most to the interclass distance. We applied our feature selection method to five public datasets in oncology, and assessed the prediction performance of the optimal signatures as input to the diagonal linear discriminant analysis (DLDA) classifier. Performance was at the same level as or better than the best reported. The predictions were robust, and the signatures were almost always significantly smaller. We studied in more detail the performance of our predictive modeling on two breast cancer datasets to predict the response to preoperative chemotherapy: performance was higher than previously reported, the signatures were three times smaller (11 versus 30 gene signatures), and the member genes of the signature were known to be involved in the response to chemotherapy. CONCLUSIONS: Defining molecular signatures as the optima of a bi-objective function that combined the signature size and the interclass distance was well founded and efficient for prediction in oncogenomics. The complexity of the computation was very low because the optimal signatures were the sets of top-ranked genes in the ranking of their valuation. Software can be freely downloaded from http://gardeux-vincent.eu/DeltaRanking.php |
format | Online Article Text |
id | pubmed-4426938 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-4426938 2015-05-15 Computing Molecular Signatures as Optima of a Bi-Objective Function: Method and Application to Prediction in Oncogenomics Gardeux, Vincent Chelouah, Rachid Wanderley, Maria F Barbosa Siarry, Patrick Braga, Antônio P Reyal, Fabien Rouzier, Roman Pusztai, Lajos Natowicz, René Cancer Inform Methodology Libertas Academica 2015-04-19 /pmc/articles/PMC4426938/ /pubmed/25983540 http://dx.doi.org/10.4137/CIN.S21111 Text en © 2015 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License. |
title | Computing Molecular Signatures as Optima of a Bi-Objective Function: Method and Application to Prediction in Oncogenomics |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4426938/ https://www.ncbi.nlm.nih.gov/pubmed/25983540 http://dx.doi.org/10.4137/CIN.S21111 |
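
The three results stated in the description can be illustrated with a small sketch. It is not the authors' DeltaRanking software, and the record does not give the exact definitions, so several things here are assumptions: the interclass distance is taken to be additive, ie, the sum of per-gene contributions (the hypothetical list `delta`); the convex combination is written as `lam * distance - (1 - lam) * size`; and the function names are made up for illustration. Under those assumptions, an exhaustive search over all nonempty subsets should return a set of top-ranked genes for every weight `lam`, which is what the ranking shortcut computes directly.

```python
from itertools import combinations


def rank_genes(delta):
    """Gene indices sorted by decreasing contribution to the distance."""
    return sorted(range(len(delta)), key=lambda g: delta[g], reverse=True)


def top_k_signature(delta, k):
    """Signature made of the k top-ranked genes (cost: one sort)."""
    return frozenset(rank_genes(delta)[:k])


def scalarized(delta, signature, lam):
    """Assumed convex combination: reward distance, penalize size."""
    distance = sum(delta[g] for g in signature)  # additive distance (assumption)
    return lam * distance - (1.0 - lam) * len(signature)


def brute_force_optimum(delta, lam):
    """Best nonempty signature found by enumerating all 2^n - 1 subsets."""
    n = len(delta)
    best, best_value = None, float("-inf")
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            value = scalarized(delta, subset, lam)
            if value > best_value:
                best, best_value = frozenset(subset), value
    return best


# Hypothetical per-gene contributions for four genes (illustration only).
delta = [0.9, 0.1, 2.5, 1.4]

for lam in (0.3, 0.5, 0.6, 0.95):
    optimum = brute_force_optimum(delta, lam)
    # The exhaustive optimum coincides with a top-k signature.
    print(lam, sorted(optimum), optimum == top_k_signature(delta, len(optimum)))
```

As `lam` grows, the size penalty weakens and the optimum gains one gene at a time in ranking order, so the optimal signatures are nested. The enumeration is exponential in the number of genes and serves only as a check; the ranking itself needs a single sort, which is consistent with the low complexity claimed in the description.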