Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects
Combination therapy has gained popularity in cancer treatment as it enhances the treatment efficacy and overcomes drug resistance. Although machine learning (ML) techniques have become an indispensable tool for discovering new drug combinations, the data on drug combination therapy currently availab...
Main authors: | Liu, Mengmeng; Srivastava, Gopal; Ramanujam, J.; Brylinski, Michal |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Journal Experts, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635365/ https://www.ncbi.nlm.nih.gov/pubmed/37961281 http://dx.doi.org/10.21203/rs.3.rs-3481858/v1 |
_version_ | 1785146333616668672 |
---|---|
author | Liu, Mengmeng Srivastava, Gopal Ramanujam, J. Brylinski, Michal |
author_facet | Liu, Mengmeng Srivastava, Gopal Ramanujam, J. Brylinski, Michal |
author_sort | Liu, Mengmeng |
collection | PubMed |
description | Combination therapy has gained popularity in cancer treatment as it enhances the treatment efficacy and overcomes drug resistance. Although machine learning (ML) techniques have become an indispensable tool for discovering new drug combinations, the data on drug combination therapy currently available may be insufficient to build high-precision models. We developed a data augmentation protocol to unbiasedly scale up the existing anti-cancer drug synergy dataset. Using a new drug similarity metric, we augmented the synergy data by substituting a compound in a drug combination instance with another molecule that exhibits highly similar pharmacological effects. Using this protocol, we were able to upscale the AZ-DREAM Challenges dataset from 8,798 to 6,016,697 drug combinations. Comprehensive performance evaluations show that Random Forest and Gradient Boosting Trees models trained on the augmented data achieve higher accuracy than those trained solely on the original dataset. Our data augmentation protocol provides a systematic and unbiased approach to generating more diverse and larger-scale drug combination datasets, enabling the development of more precise and effective ML models. The protocol presented in this study could serve as a foundation for future research aimed at discovering novel and effective drug combinations for cancer treatment. |
format | Online Article Text |
id | pubmed-10635365 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Journal Experts |
record_format | MEDLINE/PubMed |
spelling | pubmed-10635365 2023-11-13 Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects Liu, Mengmeng Srivastava, Gopal Ramanujam, J. Brylinski, Michal Res Sq Article Combination therapy has gained popularity in cancer treatment as it enhances the treatment efficacy and overcomes drug resistance. Although machine learning (ML) techniques have become an indispensable tool for discovering new drug combinations, the data on drug combination therapy currently available may be insufficient to build high-precision models. We developed a data augmentation protocol to unbiasedly scale up the existing anti-cancer drug synergy dataset. Using a new drug similarity metric, we augmented the synergy data by substituting a compound in a drug combination instance with another molecule that exhibits highly similar pharmacological effects. Using this protocol, we were able to upscale the AZ-DREAM Challenges dataset from 8,798 to 6,016,697 drug combinations. Comprehensive performance evaluations show that Random Forest and Gradient Boosting Trees models trained on the augmented data achieve higher accuracy than those trained solely on the original dataset. Our data augmentation protocol provides a systematic and unbiased approach to generating more diverse and larger-scale drug combination datasets, enabling the development of more precise and effective ML models. The protocol presented in this study could serve as a foundation for future research aimed at discovering novel and effective drug combinations for cancer treatment. American Journal Experts 2023-10-28 /pmc/articles/PMC10635365/ /pubmed/37961281 http://dx.doi.org/10.21203/rs.3.rs-3481858/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Liu, Mengmeng Srivastava, Gopal Ramanujam, J. Brylinski, Michal Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects |
title | Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects |
title_full | Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects |
title_fullStr | Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects |
title_full_unstemmed | Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects |
title_short | Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects |
title_sort | augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635365/ https://www.ncbi.nlm.nih.gov/pubmed/37961281 http://dx.doi.org/10.21203/rs.3.rs-3481858/v1 |
work_keys_str_mv | AT liumengmeng augmenteddrugcombinationdatasettoimprovetheperformanceofmachinelearningmodelspredictingsynergisticanticancereffects AT srivastavagopal augmenteddrugcombinationdatasettoimprovetheperformanceofmachinelearningmodelspredictingsynergisticanticancereffects AT ramanujamj augmenteddrugcombinationdatasettoimprovetheperformanceofmachinelearningmodelspredictingsynergisticanticancereffects AT brylinskimichal augmenteddrugcombinationdatasettoimprovetheperformanceofmachinelearningmodelspredictingsynergisticanticancereffects |
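
The description field above says the synergy data were augmented by substituting a compound in a drug combination with another molecule that has highly similar pharmacological effects. The following is only a minimal sketch of that substitution idea, not the authors' implementation. The record does not define the similarity metric, so the `similarity` lookup table, the `threshold` value, the `Combo` record layout, and the choice to carry the original synergy score over to the new instance are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Combo(NamedTuple):
    """One drug combination instance (hypothetical layout)."""
    drug_a: str
    drug_b: str
    cell_line: str
    synergy: float


def similar_drugs(drug: str,
                  similarity: Dict[FrozenSet[str], float],
                  threshold: float) -> List[str]:
    """Return drugs whose assumed similarity to `drug` meets the threshold."""
    found = []
    for pair, score in similarity.items():
        # Each key is assumed to hold exactly two distinct drug names.
        if drug in pair and len(pair) == 2 and score >= threshold:
            (other,) = pair - {drug}
            found.append(other)
    return sorted(found)


def augment(combos: List[Combo],
            similarity: Dict[FrozenSet[str], float],
            threshold: float = 0.9) -> List[Combo]:
    """Add new instances by swapping one drug for a highly similar one.

    Assumption: the substituted instance inherits the original synergy score.
    """
    seen = {(frozenset((c.drug_a, c.drug_b)), c.cell_line) for c in combos}
    result = list(combos)
    for c in combos:
        candidates = []
        for sub in similar_drugs(c.drug_a, similarity, threshold):
            candidates.append(c._replace(drug_a=sub))
        for sub in similar_drugs(c.drug_b, similarity, threshold):
            candidates.append(c._replace(drug_b=sub))
        for new in candidates:
            if new.drug_a == new.drug_b:
                continue  # skip a drug paired with itself
            key = (frozenset((new.drug_a, new.drug_b)), new.cell_line)
            if key not in seen:  # skip duplicates of existing instances
                seen.add(key)
                result.append(new)
    return result


# Toy data: drug names, cell line, and scores are invented for illustration.
original = [Combo("A", "B", "MCF7", 12.5)]
toy_similarity = {
    frozenset(("A", "A2")): 0.95,
    frozenset(("B", "B2")): 0.50,
    frozenset(("A", "B")): 0.99,
}
augmented = augment(original, toy_similarity, threshold=0.9)
```

Only one drug is swapped per new instance here, and duplicates are filtered by treating the drug pair as unordered; whether the published protocol does either is not stated in this record.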