A novel descriptor based on atom-pair properties

BACKGROUND: Molecular descriptors have been widely used to predict biological activities and physicochemical properties or to analyze chemical libraries on the basis of similarity. Although fingerprints and properties are generally used as descriptors, neither is perfect for these purposes. A finger...

Bibliographic Details
Main Author: Kuroda, Masataka
Format: Online Article Text
Language: English
Published: Springer International Publishing 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270600/
https://www.ncbi.nlm.nih.gov/pubmed/28316652
http://dx.doi.org/10.1186/s13321-016-0187-6
_version_ 1782501200242933760
author Kuroda, Masataka
author_facet Kuroda, Masataka
author_sort Kuroda, Masataka
collection PubMed
description BACKGROUND: Molecular descriptors have been widely used to predict biological activities and physicochemical properties and to analyze chemical libraries on the basis of similarity. Although fingerprints and properties are generally used as descriptors, neither is perfect for these purposes. In certain cases a fingerprint can distinguish between molecules that a property cannot, and vice versa. When the training set is especially small, constructing good predictive models is difficult. Herein, a novel descriptor integrating mutually compensating fingerprint and property characteristics is described. The format of this descriptor is not conventional: it represents one molecule in two dimensions, with variable length in one of them. This format cannot be used as is with machine learning methods. Therefore, a distance between molecules has been newly defined for application to machine learning techniques. The descriptor was evaluated on classification tasks using a support vector machine after the features of the descriptor had been optimized by a genetic algorithm. RESULTS: Because feature optimization is time-intensive owing to the complicated calculation of distances between molecules, the optimization had to be stopped before it was completed. As a result, no remarkable improvement was observed in the classification results for the new descriptor compared with those for other descriptors in any evaluation set used in this work. However, no extremely low accuracies were found for any set either. CONCLUSIONS: The novel descriptor proposed in this work can potentially be used to make highly accurate predictive models. This new concept in descriptors is expected to be useful for developing novel predictive methods with quick training and high accuracy.
format Online
Article
Text
id pubmed-5270600
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-5270600 2017-03-17 A novel descriptor based on atom-pair properties Kuroda, Masataka J Cheminform Research Article
Springer International Publishing 2017-01-05 /pmc/articles/PMC5270600/ /pubmed/28316652 http://dx.doi.org/10.1186/s13321-016-0187-6 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Kuroda, Masataka
A novel descriptor based on atom-pair properties
title A novel descriptor based on atom-pair properties
title_full A novel descriptor based on atom-pair properties
title_fullStr A novel descriptor based on atom-pair properties
title_full_unstemmed A novel descriptor based on atom-pair properties
title_short A novel descriptor based on atom-pair properties
title_sort novel descriptor based on atom-pair properties
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270600/
https://www.ncbi.nlm.nih.gov/pubmed/28316652
http://dx.doi.org/10.1186/s13321-016-0187-6
work_keys_str_mv AT kurodamasataka anoveldescriptorbasedonatompairproperties
AT kurodamasataka noveldescriptorbasedonatompairproperties
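
The description field above mentions a descriptor with two dimensions, one of variable length, and a newly defined distance between molecules that makes it usable with a support vector machine. The record does not give that distance, so the following is only a minimal sketch of the general idea under stated assumptions: each molecule is taken to be a list of fixed-width rows (one hypothetical property vector per atom pair), and the distance is a symmetric average of nearest-row Euclidean distances. The names `row_distance`, `molecule_distance`, and `similarity_matrix`, and the parameter `gamma`, are illustrative and are not taken from the article.

```python
import math
from typing import List, Sequence

Row = Sequence[float]   # assumed: one fixed-width property vector per atom pair
Molecule = List[Row]    # variable number of rows per molecule


def row_distance(a: Row, b: Row) -> float:
    """Euclidean distance between two rows of equal width."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def directed_distance(m1: Molecule, m2: Molecule) -> float:
    """Mean, over the rows of m1, of the distance to the nearest row of m2."""
    return sum(min(row_distance(r, s) for s in m2) for r in m1) / len(m1)


def molecule_distance(m1: Molecule, m2: Molecule) -> float:
    """Symmetric distance that works for molecules with different row counts.

    This is an assumed stand-in, NOT the distance defined in the article.
    """
    return 0.5 * (directed_distance(m1, m2) + directed_distance(m2, m1))


def similarity_matrix(mols: List[Molecule], gamma: float = 1.0) -> List[List[float]]:
    """Pairwise similarities exp(-gamma * d) for a list of molecules."""
    return [[math.exp(-gamma * molecule_distance(a, b)) for b in mols] for a in mols]


# Toy data: two "molecules" with different numbers of rows.
mol_a = [[0.0, 0.0], [1.0, 0.0]]
mol_b = [[0.0, 0.0]]
print(molecule_distance(mol_a, mol_b))
print(similarity_matrix([mol_a, mol_b]))
```

A matrix like this could in principle be passed to a kernel method that accepts precomputed similarities, such as scikit-learn's `SVC(kernel="precomputed")`, although a similarity built this way is not guaranteed to be a valid positive semi-definite kernel. Every pair of molecules requires a nested loop over rows, which illustrates why the distance calculation can dominate the cost of feature optimization.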