LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP

Lipophilicity is a fundamental physical property that significantly affects various aspects of drug behavior, including solubility, permeability, metabolism, distribution, protein binding, and toxicity. Accurate prediction of lipophilicity, measured by the logD7.4 value (the distribution coefficient...

Bibliographic Details
Main Authors: Wang, Yitian; Xiong, Jiacheng; Xiao, Fu; Zhang, Wei; Cheng, Kaiyang; Rao, Jingxin; Niu, Buying; Tong, Xiaochu; Qu, Ning; Zhang, Runze; Wang, Dingyan; Chen, Kaixian; Li, Xutong; Zheng, Mingyue
Format: Online Article Text
Language: English
Published: Springer International Publishing 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10478446/
https://www.ncbi.nlm.nih.gov/pubmed/37670374
http://dx.doi.org/10.1186/s13321-023-00754-4
_version_ 1785101352435712000
author Wang, Yitian
Xiong, Jiacheng
Xiao, Fu
Zhang, Wei
Cheng, Kaiyang
Rao, Jingxin
Niu, Buying
Tong, Xiaochu
Qu, Ning
Zhang, Runze
Wang, Dingyan
Chen, Kaixian
Li, Xutong
Zheng, Mingyue
author_facet Wang, Yitian
Xiong, Jiacheng
Xiao, Fu
Zhang, Wei
Cheng, Kaiyang
Rao, Jingxin
Niu, Buying
Tong, Xiaochu
Qu, Ning
Zhang, Runze
Wang, Dingyan
Chen, Kaixian
Li, Xutong
Zheng, Mingyue
author_sort Wang, Yitian
collection PubMed
description Lipophilicity is a fundamental physical property that significantly affects various aspects of drug behavior, including solubility, permeability, metabolism, distribution, protein binding, and toxicity. Accurate prediction of lipophilicity, measured by the logD7.4 value (the distribution coefficient between n-octanol and buffer at physiological pH 7.4), is crucial for successful drug discovery and design. However, the limited availability of data for logD modeling poses a significant challenge to achieving satisfactory generalization capability. To address this challenge, we have developed a novel logD7.4 prediction model called RTlogD, which leverages knowledge from multiple sources. RTlogD combines pre-training on a chromatographic retention time (RT) dataset since the RT is influenced by lipophilicity. Additionally, microscopic pKa values are incorporated as atomic features, providing valuable insights into ionizable sites and ionization capacity. Furthermore, logP is integrated as an auxiliary task within a multitask learning framework. We conducted ablation studies and presented a detailed analysis, showcasing the effectiveness and interpretability of RT, pKa, and logP in the RTlogD model. Notably, our RTlogD model demonstrated superior performance compared to commonly used algorithms and prediction tools. These results underscore the potential of the RTlogD model to improve the accuracy and generalization of logD prediction in drug discovery and design. In summary, the RTlogD model addresses the challenge of limited data availability in logD modeling by leveraging knowledge from RT, microscopic pKa, and logP. Incorporating these factors enhances the predictive capabilities of our model, and it holds promise for real-world applications in drug discovery and design scenarios. GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-023-00754-4.
format Online
Article
Text
id pubmed-10478446
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-104784462023-09-06 LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP Wang, Yitian Xiong, Jiacheng Xiao, Fu Zhang, Wei Cheng, Kaiyang Rao, Jingxin Niu, Buying Tong, Xiaochu Qu, Ning Zhang, Runze Wang, Dingyan Chen, Kaixian Li, Xutong Zheng, Mingyue J Cheminform Research Lipophilicity is a fundamental physical property that significantly affects various aspects of drug behavior, including solubility, permeability, metabolism, distribution, protein binding, and toxicity. Accurate prediction of lipophilicity, measured by the logD7.4 value (the distribution coefficient between n-octanol and buffer at physiological pH 7.4), is crucial for successful drug discovery and design. However, the limited availability of data for logD modeling poses a significant challenge to achieving satisfactory generalization capability. To address this challenge, we have developed a novel logD7.4 prediction model called RTlogD, which leverages knowledge from multiple sources. RTlogD combines pre-training on a chromatographic retention time (RT) dataset since the RT is influenced by lipophilicity. Additionally, microscopic pKa values are incorporated as atomic features, providing valuable insights into ionizable sites and ionization capacity. Furthermore, logP is integrated as an auxiliary task within a multitask learning framework. We conducted ablation studies and presented a detailed analysis, showcasing the effectiveness and interpretability of RT, pKa, and logP in the RTlogD model. Notably, our RTlogD model demonstrated superior performance compared to commonly used algorithms and prediction tools. These results underscore the potential of the RTlogD model to improve the accuracy and generalization of logD prediction in drug discovery and design. In summary, the RTlogD model addresses the challenge of limited data availability in logD modeling by leveraging knowledge from RT, microscopic pKa, and logP. 
Incorporating these factors enhances the predictive capabilities of our model, and it holds promise for real-world applications in drug discovery and design scenarios. GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-023-00754-4. Springer International Publishing 2023-09-05 /pmc/articles/PMC10478446/ /pubmed/37670374 http://dx.doi.org/10.1186/s13321-023-00754-4 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Wang, Yitian
Xiong, Jiacheng
Xiao, Fu
Zhang, Wei
Cheng, Kaiyang
Rao, Jingxin
Niu, Buying
Tong, Xiaochu
Qu, Ning
Zhang, Runze
Wang, Dingyan
Chen, Kaixian
Li, Xutong
Zheng, Mingyue
LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP
title LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP
title_full LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP
title_fullStr LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP
title_full_unstemmed LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP
title_short LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP
title_sort logd7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pka and logp
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10478446/
https://www.ncbi.nlm.nih.gov/pubmed/37670374
http://dx.doi.org/10.1186/s13321-023-00754-4
work_keys_str_mv AT wangyitian logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT xiongjiacheng logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT xiaofu logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT zhangwei logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT chengkaiyang logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT raojingxin logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT niubuying logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT tongxiaochu logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT quning logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT zhangrunze logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT wangdingyan logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT chenkaixian logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT lixutong logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp
AT zhengmingyue logd74predictionenhancedbytransferringknowledgefromchromatographicretentiontimemicroscopicpkaandlogp