A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics

Publicly available compound and bioactivity databases provide an essential basis for data-driven applications in life-science research and drug design. By analyzing several bioactivity repositories, we discovered differences in compound and target coverage advocating the combined use of data from mu...

Bibliographic Details
Main authors: Isigkeit, Laura, Chaikuad, Apirat, Merk, Daniel
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9028877/
https://www.ncbi.nlm.nih.gov/pubmed/35458710
http://dx.doi.org/10.3390/molecules27082513
_version_ 1784691735530569728
author Isigkeit, Laura
Chaikuad, Apirat
Merk, Daniel
author_facet Isigkeit, Laura
Chaikuad, Apirat
Merk, Daniel
author_sort Isigkeit, Laura
collection PubMed
description Publicly available compound and bioactivity databases provide an essential basis for data-driven applications in life-science research and drug design. By analyzing several bioactivity repositories, we discovered differences in compound and target coverage advocating the combined use of data from multiple sources. Using data from ChEMBL, PubChem, IUPHAR/BPS, BindingDB, and Probes & Drugs, we assembled a consensus dataset focusing on small molecules with bioactivity on human macromolecular targets. This allowed an improved coverage of compound space and targets, and an automated comparison and curation of structural and bioactivity data to reveal potentially erroneous entries and increase confidence. The consensus dataset comprised more than 1.1 million compounds with over 10.9 million bioactivity data points with annotations on assay type and bioactivity confidence, providing a useful ensemble for computational applications in drug design and chemogenomics.
format Online
Article
Text
id pubmed-9028877
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9028877 2022-04-23 A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics Isigkeit, Laura Chaikuad, Apirat Merk, Daniel Molecules Article Publicly available compound and bioactivity databases provide an essential basis for data-driven applications in life-science research and drug design. By analyzing several bioactivity repositories, we discovered differences in compound and target coverage advocating the combined use of data from multiple sources. Using data from ChEMBL, PubChem, IUPHAR/BPS, BindingDB, and Probes & Drugs, we assembled a consensus dataset focusing on small molecules with bioactivity on human macromolecular targets. This allowed an improved coverage of compound space and targets, and an automated comparison and curation of structural and bioactivity data to reveal potentially erroneous entries and increase confidence. The consensus dataset comprised more than 1.1 million compounds with over 10.9 million bioactivity data points with annotations on assay type and bioactivity confidence, providing a useful ensemble for computational applications in drug design and chemogenomics. MDPI 2022-04-13 /pmc/articles/PMC9028877/ /pubmed/35458710 http://dx.doi.org/10.3390/molecules27082513 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Isigkeit, Laura
Chaikuad, Apirat
Merk, Daniel
A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics
title A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics
title_full A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics
title_fullStr A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics
title_full_unstemmed A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics
title_short A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics
title_sort consensus compound/bioactivity dataset for data-driven drug design and chemogenomics
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9028877/
https://www.ncbi.nlm.nih.gov/pubmed/35458710
http://dx.doi.org/10.3390/molecules27082513
work_keys_str_mv AT isigkeitlaura aconsensuscompoundbioactivitydatasetfordatadrivendrugdesignandchemogenomics
AT chaikuadapirat aconsensuscompoundbioactivitydatasetfordatadrivendrugdesignandchemogenomics
AT merkdaniel aconsensuscompoundbioactivitydatasetfordatadrivendrugdesignandchemogenomics
AT isigkeitlaura consensuscompoundbioactivitydatasetfordatadrivendrugdesignandchemogenomics
AT chaikuadapirat consensuscompoundbioactivitydatasetfordatadrivendrugdesignandchemogenomics
AT merkdaniel consensuscompoundbioactivitydatasetfordatadrivendrugdesignandchemogenomics