An integrated network representation of multiple cancer-specific data for graph-based machine learning
Genomic profiles of cancer cells provide valuable information on genetic alterations in cancer. Several recent studies employed these data to predict the response of cancer cell lines to drug treatment. Nonetheless, due to the multifactorial phenotypes and intricate mechanisms of cancer, the accurat...
Main authors: | Pu, Limeng; Singha, Manali; Wu, Hsiao-Chun; Busch, Costas; Ramanujam, J.; Brylinski, Michal |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9054771/ https://www.ncbi.nlm.nih.gov/pubmed/35487924 http://dx.doi.org/10.1038/s41540-022-00226-9 |
_version_ | 1784697265915428864 |
---|---|
author | Pu, Limeng Singha, Manali Wu, Hsiao-Chun Busch, Costas Ramanujam, J. Brylinski, Michal |
author_sort | Pu, Limeng |
collection | PubMed |
description | Genomic profiles of cancer cells provide valuable information on genetic alterations in cancer. Several recent studies employed these data to predict the response of cancer cell lines to drug treatment. Nonetheless, due to the multifactorial phenotypes and intricate mechanisms of cancer, accurately predicting the effect of pharmacotherapy on a specific cell line from genetic information alone is problematic. Emphasizing the system-level complexity of cancer, we devised a procedure to integrate multiple heterogeneous data, including biological networks, genomics, inhibitor profiling, and gene-disease associations, into a unified graph structure. To construct compact yet information-rich cancer-specific networks, we developed a novel graph reduction algorithm. Driven not only by topological information but also by biological knowledge, the graph reduction increases the feature-only entropy while preserving the valuable graph-feature information. Subsequent comparative benchmarking simulations employing a tissue-level cross-validation protocol demonstrate that the accuracy of a graph-based predictor of drug efficacy is 0.68, which is notably higher than the accuracies measured for more traditional, matrix-based techniques on the same data. Overall, the non-Euclidean representation of the cancer-specific data improves the performance of machine learning in predicting the response of cancer to pharmacotherapy. The generated data are freely available to the academic community at https://osf.io/dzx7b/. |
format | Online Article Text |
id | pubmed-9054771 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9054771 2022-05-01 An integrated network representation of multiple cancer-specific data for graph-based machine learning Pu, Limeng Singha, Manali Wu, Hsiao-Chun Busch, Costas Ramanujam, J. Brylinski, Michal NPJ Syst Biol Appl Article Nature Publishing Group UK 2022-04-29 /pmc/articles/PMC9054771/ /pubmed/35487924 http://dx.doi.org/10.1038/s41540-022-00226-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | An integrated network representation of multiple cancer-specific data for graph-based machine learning |
topic | Article |