Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction

Bibliographic Details
Main Authors: Liu, Xiang; Feng, Huitao; Wu, Jie; Xia, Kelin
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8985993/
https://www.ncbi.nlm.nih.gov/pubmed/35385478
http://dx.doi.org/10.1371/journal.pcbi.1009943
author Liu, Xiang
Feng, Huitao
Wu, Jie
Xia, Kelin
collection PubMed
description With the great advancements in experimental data, computational power and learning algorithms, artificial intelligence (AI) based drug design has recently begun to gain momentum. AI-based drug design holds great promise for revolutionizing the pharmaceutical industry by significantly reducing the time and cost of drug discovery. However, a major issue remains for all AI-based learning models: efficient molecular representation. Here we propose, for the first time, Dowker complex (DC) based molecular interaction representations and Riemann zeta function based molecular featurization. Molecular interactions between proteins and ligands (or others) are modeled as Dowker complexes. A multiscale representation is generated through a filtration process, during which a series of DCs is generated at different scales. Combinatorial (Hodge) Laplacian matrices are constructed from these DCs, and the Riemann zeta functions computed from their spectral information are used as molecular descriptors. To validate our models, we consider protein-ligand binding affinity prediction. Our DC-based machine learning (DCML) models, in particular the DC-based gradient boosting tree (DC-GBT), are tested on three of the most commonly used datasets, namely PDBbind-2007, PDBbind-2013 and PDBbind-2016, and extensively compared with existing state-of-the-art models. We find that our DC-based descriptors achieve state-of-the-art results and outperform all machine learning models with traditional molecular descriptors. Our Dowker complex based machine learning models can also be used in other tasks in AI-based drug design and molecular data analysis.
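The pipeline summarized in the description (Dowker complex, filtration, Laplacian spectrum, zeta-type descriptor) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the relation between the two point sets is a plain Euclidean distance cutoff, that only vertices and edges of the complex are built, that only the 0-dimensional Laplacian L0 = B1 B1^T is used, and that the descriptor is the sum of lambda^(-s) over the nonzero eigenvalues. All function names and parameters (dowker_edges, laplacian_0, spectral_zeta, dowker_features, cutoffs, s_values) are hypothetical.

```python
import itertools
import numpy as np


def dowker_edges(X, Y, cutoff):
    """Vertices and edges of the Dowker complex on X with witnesses in Y.

    Points of X span a simplex when a single point of Y lies within
    `cutoff` of every one of them. Only 0- and 1-simplices are built
    (an assumption of this sketch).
    """
    # relation[i, j] is True when X[i] and Y[j] are within the cutoff.
    dists = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    relation = dists <= cutoff
    vertices = [i for i in range(len(X)) if relation[i].any()]
    edges = [
        (i, j)
        for i, j in itertools.combinations(vertices, 2)
        if (relation[i] & relation[j]).any()  # shared witness in Y
    ]
    return vertices, edges


def laplacian_0(vertices, edges):
    """0-dimensional combinatorial Laplacian L0 = B1 @ B1.T."""
    index = {v: k for k, v in enumerate(vertices)}
    B1 = np.zeros((len(vertices), len(edges)))
    for col, (i, j) in enumerate(edges):
        B1[index[i], col] = -1.0
        B1[index[j], col] = 1.0
    return B1 @ B1.T


def spectral_zeta(L, s, tol=1e-8):
    """Sum of lambda**(-s) over the eigenvalues of L larger than `tol`."""
    if L.size == 0:  # empty complex: no spectrum to sum over
        return 0.0
    eigvals = np.linalg.eigvalsh(L)
    positive = eigvals[eigvals > tol]
    return float(np.sum(positive ** (-s)))


def dowker_features(X, Y, cutoffs, s_values):
    """One zeta value per (cutoff, s) pair, flattened into a vector."""
    feats = []
    for r in cutoffs:  # each cutoff is one step of the filtration
        vertices, edges = dowker_edges(X, Y, r)
        L0 = laplacian_0(vertices, edges)
        feats.extend(spectral_zeta(L0, s) for s in s_values)
    return np.array(feats)
```

For example, calling dowker_features(ligand_xyz, protein_xyz, cutoffs=[3.0, 4.0, 5.0], s_values=[1.0, 2.0]) on two arrays of 3D coordinates would return a vector of length six, which could then be passed to a regressor such as a gradient boosting tree. The cutoff and s values here are placeholders, not values taken from the paper.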
format Online
Article
Text
id pubmed-8985993
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8985993 2022-04-07 Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction Liu, Xiang; Feng, Huitao; Wu, Jie; Xia, Kelin PLoS Comput Biol Research Article
Public Library of Science 2022-04-06 /pmc/articles/PMC8985993/ /pubmed/35385478 http://dx.doi.org/10.1371/journal.pcbi.1009943 Text en © 2022 Liu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8985993/
https://www.ncbi.nlm.nih.gov/pubmed/35385478
http://dx.doi.org/10.1371/journal.pcbi.1009943