PreBINDS: An Interactive Web Tool to Create Appropriate Datasets for Predicting Compound–Protein Interactions

Given the abundant computational resources and the huge amount of data of compound–protein interactions (CPIs), constructing appropriate datasets for learning and evaluating prediction models for CPIs is not always easy. For this study, we have developed a web server to facilitate the development an...

Bibliographic Details
Main authors: Ikeda, Kazuyoshi; Doi, Takuo; Ikeda, Masami; Tomii, Kentaro
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Molecular Biosciences
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8685504/
https://www.ncbi.nlm.nih.gov/pubmed/34938773
http://dx.doi.org/10.3389/fmolb.2021.758480
author Ikeda, Kazuyoshi
Doi, Takuo
Ikeda, Masami
Tomii, Kentaro
collection PubMed
description Given the abundant computational resources and the huge amount of data of compound–protein interactions (CPIs), constructing appropriate datasets for learning and evaluating prediction models for CPIs is not always easy. For this study, we have developed a web server to facilitate the development and evaluation of prediction models by providing an appropriate dataset according to the task. Our web server provides an environment and dataset that aid model developers and evaluators in obtaining a suitable dataset for both proteins and compounds, in addition to attributes necessary for deep learning. With the web server interface, users can customize the CPI dataset derived from ChEMBL by setting positive and negative thresholds to be adjusted according to the user’s definitions. We have also implemented a function for graphic display of the distribution of activity values in the dataset as a histogram to set appropriate thresholds for positive and negative examples. These functions enable effective development and evaluation of models. Furthermore, users can prepare their task-specific datasets by selecting a set of target proteins based on various criteria such as Pfam families, ChEMBL’s classification, and sequence similarities. The accuracy and efficiency of in silico screening and drug design using machine learning including deep learning can therefore be improved by facilitating access to an appropriate dataset prepared using our web server (https://binds.lifematics.work/).
format Online
Article
Text
id pubmed-8685504
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8685504 2021-12-21 Front Mol Biosci Molecular Biosciences Frontiers Media S.A. 2021-12-06 /pmc/articles/PMC8685504/ /pubmed/34938773 http://dx.doi.org/10.3389/fmolb.2021.758480 Text en Copyright © 2021 Ikeda, Doi, Ikeda and Tomii.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title PreBINDS: An Interactive Web Tool to Create Appropriate Datasets for Predicting Compound–Protein Interactions
topic Molecular Biosciences
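
The description in this record mentions two functions of the web server: setting positive and negative thresholds on activity values, and viewing the distribution of those values as a histogram. The sketch below illustrates that general idea only; it is not the PreBINDS implementation. It assumes, purely for illustration, that activities are on a negative-log scale where higher means more active, and the function names, the cutoffs 6.0 and 5.0, and the bin width are all hypothetical.

```python
import math
from typing import Dict, Iterable, Optional


def label_activity(value: float, positive_min: float, negative_max: float) -> Optional[int]:
    """Label one activity value using two user-chosen thresholds.

    Returns 1 (positive) if value >= positive_min, 0 (negative) if
    value <= negative_max, and None for values in the gap between the
    thresholds, which a user would typically leave out of the dataset.
    Assumes higher values mean stronger activity (hypothetical scale).
    """
    if negative_max > positive_min:
        raise ValueError("negative_max must not exceed positive_min")
    if value >= positive_min:
        return 1
    if value <= negative_max:
        return 0
    return None


def histogram(values: Iterable[float], bin_width: float = 1.0) -> Dict[float, int]:
    """Count values per bin, keyed by each bin's left edge."""
    counts: Dict[float, int] = {}
    for v in values:
        left_edge = math.floor(v / bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical activity values for four compound-protein pairs.
activities = [4.1, 5.5, 5.9, 7.2]
labels = [label_activity(a, positive_min=6.0, negative_max=5.0) for a in activities]
bins = histogram(activities)
```

Inspecting the histogram before choosing cutoffs shows how many examples would fall into the unlabeled gap, which is the kind of trade-off the description says the histogram display is meant to support.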