
Q-raKtion: A Semiautomated KNIME Workflow for Bioactivity Data Points Curation

The recent increase of bioactivity data freely available to the scientific community and stored as activity data points in chemogenomic repositories provides a huge amount of ready-to-use information to support the development of predictive models. However, the benefits provided by...


Bibliographic Details
Main authors: Palazzotti, Deborah; Fiorelli, Martina; Sabatini, Stefano; Massari, Serena; Barreca, Maria Letizia; Astolfi, Andrea
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9795488/
https://www.ncbi.nlm.nih.gov/pubmed/36442071
http://dx.doi.org/10.1021/acs.jcim.2c01199
collection PubMed
description The recent increase of bioactivity data freely available to the scientific community and stored as activity data points in chemogenomic repositories provides a huge amount of ready-to-use information to support the development of predictive models. However, the benefits provided by the availability of such a vast amount of accessible information are strongly counteracted by the lack of uniformity and consistency of data from multiple sources, requiring a process of integration and harmonization. While different automated pipelines for processing and assessing chemical data have emerged in recent years, the curation of bioactivity data points is a less investigated topic, with useful concepts provided but no tangible tools available. In this context, the present work represents a first step toward filling this gap, by providing a tool to meet the needs of end users in building proprietary high-quality data sets for further studies. Specifically, we herein describe Q-raKtion, a systematic, semiautomated, flexible, and, above all, customizable KNIME workflow that effectively aggregates information on biological activities of compounds retrieved from two of the most comprehensive and widely used repositories, PubChem and ChEMBL.
format Online
Article
Text
id pubmed-9795488
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9795488 2022-12-29
J Chem Inf Model
American Chemical Society 2022-11-28 2022-12-26
/pmc/articles/PMC9795488/ /pubmed/36442071 http://dx.doi.org/10.1021/acs.jcim.2c01199
Text en © 2022 The Authors. Published by American Chemical Society
https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained.
title Q-raKtion: A Semiautomated KNIME Workflow for Bioactivity Data Points Curation