Discriminating early- and late-stage cancers using multiple kernel learning on gene sets

MOTIVATION: Identifying molecular mechanisms that drive cancers from early to late stages is highly important to develop new preventive and therapeutic strategies. Standard machine learning algorithms could be used to discriminate early- and late-stage cancers from each other using their genomic cha...

Bibliographic Details
Main authors: Rahimi, Arezou, Gönen, Mehmet
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022595/
https://www.ncbi.nlm.nih.gov/pubmed/29949993
http://dx.doi.org/10.1093/bioinformatics/bty239
author Rahimi, Arezou
Gönen, Mehmet
author_facet Rahimi, Arezou
Gönen, Mehmet
author_sort Rahimi, Arezou
collection PubMed
description MOTIVATION: Identifying molecular mechanisms that drive cancers from early to late stages is highly important for developing new preventive and therapeutic strategies. Standard machine learning algorithms could be used to discriminate early- and late-stage cancers from each other using their genomic characterizations. Even though these algorithms would get satisfactory predictive performance, their knowledge extraction capability would be quite restricted due to the highly correlated nature of genomic data. That is why we need algorithms that can also extract relevant information about these biological mechanisms using our prior knowledge about pathways/gene sets. RESULTS: In this study, we addressed the problem of separating early- and late-stage cancers from each other using their gene expression profiles. We proposed to use a multiple kernel learning (MKL) formulation that makes use of pathways/gene sets (i) to obtain satisfactory/improved predictive performance and (ii) to identify biological mechanisms that might have an effect on cancer progression. We extensively compared our proposed MKL on gene sets algorithm against two standard machine learning algorithms, namely, random forests and support vector machines, on 20 diseases from The Cancer Genome Atlas cohorts for two different sets of experiments. Our method obtained statistically significantly better or comparable predictive performance on most of the datasets using significantly fewer gene expression features. We also showed that our algorithm was able to extract meaningful and disease-specific information that gives clues about the progression mechanism. AVAILABILITY AND IMPLEMENTATION: Our implementations of support vector machine and multiple kernel learning algorithms in R are available at https://github.com/mehmetgonen/gsbc together with the scripts that replicate the reported experiments.
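The abstract's core idea, one kernel per pathway/gene set combined into a single weighted kernel, can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the Gaussian kernel choice, the function names, the toy data, and the gene set names `SET_A` and `SET_B` are all assumptions made for illustration, and the kernel weights are fixed by hand here rather than learned as an MKL solver would do.

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def gene_set_kernels(X, gene_sets, sigma=1.0):
    """Build one kernel per gene set, using only that set's expression columns."""
    return {name: gaussian_kernel(X[:, cols], sigma)
            for name, cols in gene_sets.items()}


def combine_kernels(kernels, weights):
    """Weighted sum of kernels; weights must be nonnegative and are rescaled to sum to 1."""
    names = list(kernels)
    eta = np.array([weights[name] for name in names], dtype=float)
    if np.any(eta < 0) or eta.sum() == 0:
        raise ValueError("weights must be nonnegative and not all zero")
    eta = eta / eta.sum()
    return sum(e * kernels[name] for e, name in zip(eta, names))


# Toy data (hypothetical): 6 samples, 5 gene expression features.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))

# Hypothetical gene sets given as column indices; real sets may overlap like this.
gene_sets = {"SET_A": [0, 1, 2], "SET_B": [2, 3, 4]}

kernels = gene_set_kernels(X, gene_sets)

# Uniform weights: every gene set contributes equally.
K_uniform = combine_kernels(kernels, {"SET_A": 1.0, "SET_B": 1.0})

# A zero weight removes a gene set (and its genes) from the model entirely.
K_sparse = combine_kernels(kernels, {"SET_A": 1.0, "SET_B": 0.0})
```

In an actual MKL formulation the weights would be learned jointly with the classifier, so gene sets receiving nonzero weight point to candidate mechanisms and the model uses only the genes in those sets. A combined kernel such as `K_uniform` could then be handed to any kernel classifier that accepts a precomputed kernel matrix.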
format Online
Article
Text
id pubmed-6022595
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6022595 2018-07-10 Discriminating early- and late-stage cancers using multiple kernel learning on gene sets. Rahimi, Arezou; Gönen, Mehmet. Bioinformatics. ISMB 2018–Intelligent Systems for Molecular Biology Proceedings. Oxford University Press 2018-07-01 2018-06-27 /pmc/articles/PMC6022595/ /pubmed/29949993 http://dx.doi.org/10.1093/bioinformatics/bty239 Text en © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Discriminating early- and late-stage cancers using multiple kernel learning on gene sets
topic ISMB 2018–Intelligent Systems for Molecular Biology Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022595/
https://www.ncbi.nlm.nih.gov/pubmed/29949993
http://dx.doi.org/10.1093/bioinformatics/bty239