NP-Scout: Machine Learning Approach for the Quantification and Visualization of the Natural Product-Likeness of Small Molecules

Natural products (NPs) remain the most prolific resource for the development of small-molecule drugs. Here we report a new machine learning approach that allows the identification of natural products with high accuracy. The method also generates similarity maps, which highlight atoms that contribute...

Bibliographic Details
Main authors: Chen, Ya; Stork, Conrad; Hirte, Steffen; Kirchmair, Johannes
Format: Online Article Text
Language: English
Published: MDPI 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6406893/
https://www.ncbi.nlm.nih.gov/pubmed/30682850
http://dx.doi.org/10.3390/biom9020043
author Chen, Ya
Stork, Conrad
Hirte, Steffen
Kirchmair, Johannes
collection PubMed
description Natural products (NPs) remain the most prolific resource for the development of small-molecule drugs. Here we report a new machine learning approach that allows the identification of natural products with high accuracy. The method also generates similarity maps, which highlight atoms that contribute significantly to the classification of small molecules as a natural product or synthetic molecule. The method can hence be utilized to (i) identify natural products in large molecular libraries, (ii) quantify the natural product-likeness of small molecules, and (iii) visualize atoms in small molecules that are characteristic of natural products or synthetic molecules. The models are based on random forest classifiers trained on data sets consisting of more than 265,000 to 322,000 natural products and synthetic molecules. Two-dimensional molecular descriptors, MACCS keys and Morgan2 fingerprints were explored. On an independent test set the models reached areas under the receiver operating characteristic curve (AUC) of 0.997 and Matthews correlation coefficients (MCCs) of 0.954 and higher. The method was further tested on data from the Dictionary of Natural Products, ChEMBL and other resources. The best-performing models are accessible as a free web service at http://npscout.zbh.uni-hamburg.de/npscout.
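The abstract reports model quality as AUC and MCC. As a minimal sketch of how these two metrics are computed for a binary natural product versus synthetic molecule classifier, the following uses only the Python standard library. The labels, scores, 0.5 decision threshold, and function names are hypothetical illustrations, not data or code from the article.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = NP, 0 = synthetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four natural products followed by four synthetic molecules.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.40, 0.55, 0.30, 0.20, 0.05]
# Assumed decision threshold of 0.5 on the predicted NP class probability.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("MCC:", matthews_corrcoef(y_true, y_pred))  # 0.5
print("AUC:", roc_auc(y_true, scores))            # 0.9375
```

MCC depends on the chosen threshold, whereas AUC is computed from the ranking of scores alone, which is why the two values are reported side by side.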
format Online
Article
Text
id pubmed-6406893
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6406893 2019-03-13 Biomolecules Article MDPI 2019-01-24 /pmc/articles/PMC6406893/ /pubmed/30682850 http://dx.doi.org/10.3390/biom9020043 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title NP-Scout: Machine Learning Approach for the Quantification and Visualization of the Natural Product-Likeness of Small Molecules
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6406893/
https://www.ncbi.nlm.nih.gov/pubmed/30682850
http://dx.doi.org/10.3390/biom9020043