
Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects


Bibliographic Details
Main Authors: Liu, Mengmeng; Srivastava, Gopal; Ramanujam, J.; Brylinski, Michal
Format: Online Article Text
Language: English
Published: American Journal Experts 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635365/
https://www.ncbi.nlm.nih.gov/pubmed/37961281
http://dx.doi.org/10.21203/rs.3.rs-3481858/v1
author Liu, Mengmeng
Srivastava, Gopal
Ramanujam, J.
Brylinski, Michal
author_sort Liu, Mengmeng
collection PubMed
description Combination therapy has gained popularity in cancer treatment as it enhances the treatment efficacy and overcomes drug resistance. Although machine learning (ML) techniques have become an indispensable tool for discovering new drug combinations, the data on drug combination therapy currently available may be insufficient to build high-precision models. We developed a data augmentation protocol to unbiasedly scale up the existing anti-cancer drug synergy dataset. Using a new drug similarity metric, we augmented the synergy data by substituting a compound in a drug combination instance with another molecule that exhibits highly similar pharmacological effects. Using this protocol, we were able to upscale the AZ-DREAM Challenges dataset from 8,798 to 6,016,697 drug combinations. Comprehensive performance evaluations show that Random Forest and Gradient Boosting Trees models trained on the augmented data achieve higher accuracy than those trained solely on the original dataset. Our data augmentation protocol provides a systematic and unbiased approach to generating more diverse and larger-scale drug combination datasets, enabling the development of more precise and effective ML models. The protocol presented in this study could serve as a foundation for future research aimed at discovering novel and effective drug combinations for cancer treatment.
format Online
Article
Text
id pubmed-10635365
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Journal Experts
record_format MEDLINE/PubMed
spelling pubmed-10635365 2023-11-13 Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects Liu, Mengmeng Srivastava, Gopal Ramanujam, J. Brylinski, Michal Res Sq Article
American Journal Experts 2023-10-28 /pmc/articles/PMC10635365/ /pubmed/37961281 http://dx.doi.org/10.21203/rs.3.rs-3481858/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635365/
https://www.ncbi.nlm.nih.gov/pubmed/37961281
http://dx.doi.org/10.21203/rs.3.rs-3481858/v1