Discovery of moiety preference by Shapley value in protein kinase family using random forest models
BACKGROUND: Human protein kinases play important roles in cancers, are highly co-regulated by kinase families rather than a single kinase, and complementarily regulate signaling pathways. Even though there are > 100,000 protein kinase inhibitors, only 67 kinase drugs are currently approved by the...
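Main Authors: | Huang, Yu-Wei; Hsu, Yen-Chao; Chuang, Yi-Hsuan; Chen, Yun-Ti; Lin, Xiang-Yu; Fan, You-Wei; Pathak, Nikhil; Yang, Jinn-Moon |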
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9011936/ https://www.ncbi.nlm.nih.gov/pubmed/35428180 http://dx.doi.org/10.1186/s12859-022-04663-5 |
_version_ | 1784687701532868608 |
---|---|
author | Huang, Yu-Wei; Hsu, Yen-Chao; Chuang, Yi-Hsuan; Chen, Yun-Ti; Lin, Xiang-Yu; Fan, You-Wei; Pathak, Nikhil; Yang, Jinn-Moon |
author_sort | Huang, Yu-Wei |
collection | PubMed |
description | BACKGROUND: Human protein kinases play important roles in cancers, are highly co-regulated by kinase families rather than a single kinase, and complementarily regulate signaling pathways. Even though there are > 100,000 protein kinase inhibitors, only 67 kinase drugs are currently approved by the Food and Drug Administration (FDA). RESULTS: In this study, we used “merged moiety-based interpretable features (MMIFs),” which merged four moiety-based compound features, including Checkmol fingerprint, PubChem fingerprint, rings in drugs, and in-house moieties as the input features for building random forest (RF) models. By using > 200,000 bioactivity test data, we classified inhibitors as kinase family inhibitors or non-inhibitors in the machine learning. The results showed that our RF models achieved good accuracy (> 0.8) for the 10 kinase families. In addition, we found kinase common and specific moieties across families using the Shapley Additive exPlanations (SHAP) approach. We also verified our results using protein kinase complex structures containing important interactions of the hinges, DFGs, or P-loops in the ATP pocket of active sites. CONCLUSIONS: In summary, we not only constructed highly accurate prediction models for predicting inhibitors of kinase families but also discovered common and specific inhibitor moieties between different kinase families, providing new opportunities for designing protein kinase inhibitors. |
format | Online Article Text |
id | pubmed-9011936 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9011936 2022-04-16 BMC Bioinformatics Research BioMed Central 2022-04-15 /pmc/articles/PMC9011936/ /pubmed/35428180 http://dx.doi.org/10.1186/s12859-022-04663-5 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Discovery of moiety preference by Shapley value in protein kinase family using random forest models |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9011936/ https://www.ncbi.nlm.nih.gov/pubmed/35428180 http://dx.doi.org/10.1186/s12859-022-04663-5 |
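The description above mentions using Shapley values (via SHAP) to rank moiety features by their contribution to a model's prediction. As a rough illustration of the underlying idea only, and not the authors' implementation, the following sketch computes exact Shapley values by enumerating feature coalitions. The feature names (`moiety_A`, `moiety_B`, `moiety_C`) and the scoring function `toy_model` are hypothetical stand-ins for real moiety fingerprints and a trained random forest. Treating an absent feature as the baseline is also a simplifying assumption; the SHAP library instead averages over background data.

```python
from itertools import combinations
from math import factorial

# Hypothetical moiety features; the names are illustrative, not from the paper.
FEATURES = ["moiety_A", "moiety_B", "moiety_C"]


def toy_model(present):
    """Toy score standing in for a classifier's predicted inhibitor probability.

    `present` is a set of moiety names found in a compound.
    """
    score = 0.1                                   # baseline with no moieties
    if "moiety_A" in present:
        score += 0.5                              # strong individual effect
    if "moiety_B" in present:
        score += 0.1                              # weak individual effect
    if "moiety_A" in present and "moiety_C" in present:
        score += 0.2                              # interaction between A and C
    return score


def shapley_values(features, value_fn):
    """Exact Shapley values: weighted average of each feature's marginal
    contribution over all coalitions of the remaining features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):                        # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


if __name__ == "__main__":
    for name, value in shapley_values(FEATURES, toy_model).items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the interaction term is split evenly between the two features involved, and the values sum to the difference between the score with all moieties present and the baseline score. Exact enumeration grows exponentially with the number of features, so it is only practical for very small feature sets.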