A novel descriptor based on atom-pair properties
BACKGROUND: Molecular descriptors have been widely used to predict biological activities and physicochemical properties or to analyze chemical libraries on the basis of similarity. Although fingerprints and properties are generally used as descriptors, neither is perfect for these purposes. A finger...
Main author: | Kuroda, Masataka |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2017 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270600/ https://www.ncbi.nlm.nih.gov/pubmed/28316652 http://dx.doi.org/10.1186/s13321-016-0187-6 |
_version_ | 1782501200242933760 |
---|---|
author | Kuroda, Masataka |
author_facet | Kuroda, Masataka |
author_sort | Kuroda, Masataka |
collection | PubMed |
description | BACKGROUND: Molecular descriptors have been widely used to predict biological activities and physicochemical properties or to analyze chemical libraries on the basis of similarity. Although fingerprints and properties are generally used as descriptors, neither is perfect for these purposes. In certain cases a fingerprint can distinguish between molecules that a property cannot, and vice versa. When the training set is especially small, constructing good predictive models is difficult. Herein, a novel descriptor integrating mutually compensating fingerprint and property characteristics is described. The format of this descriptor is not conventional: it represents one molecule in two dimensions, with variable length in one dimension. This format cannot be used directly as input to machine learning methods. Therefore, a distance between molecules has been newly defined for application to machine learning techniques. The descriptor was evaluated on classification tasks using a support vector machine after its features had been optimized by a genetic algorithm. RESULTS: Because feature optimization is time-intensive owing to the complicated calculation of distances between molecules, the optimization had to be stopped before it was completed. As a result, the new descriptor showed no remarkable improvement in classification results over other descriptors in any evaluation set used in this work. However, no extremely low accuracies were found for any set either. CONCLUSIONS: The novel descriptor proposed in this work can potentially be used to make highly accurate predictive models. This new concept in descriptors is expected to be useful for developing novel predictive methods with quick training and high accuracy. |
format | Online Article Text |
id | pubmed-5270600 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-5270600 2017-03-17 A novel descriptor based on atom-pair properties Kuroda, Masataka J Cheminform Research Article Springer International Publishing 2017-01-05 /pmc/articles/PMC5270600/ /pubmed/28316652 http://dx.doi.org/10.1186/s13321-016-0187-6 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Kuroda, Masataka A novel descriptor based on atom-pair properties |
title | A novel descriptor based on atom-pair properties |
title_full | A novel descriptor based on atom-pair properties |
title_fullStr | A novel descriptor based on atom-pair properties |
title_full_unstemmed | A novel descriptor based on atom-pair properties |
title_short | A novel descriptor based on atom-pair properties |
title_sort | novel descriptor based on atom-pair properties |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270600/ https://www.ncbi.nlm.nih.gov/pubmed/28316652 http://dx.doi.org/10.1186/s13321-016-0187-6 |
work_keys_str_mv | AT kurodamasataka anoveldescriptorbasedonatompairproperties AT kurodamasataka noveldescriptorbasedonatompairproperties |
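
The description above mentions a descriptor with two dimensions, one of variable length, and a newly defined distance between molecules. The record does not give that definition, so the following is only an illustrative sketch under stated assumptions: each molecule is a list of atom-pair rows, each row is a fixed-length tuple of property values, and the distance is a symmetric average nearest-row distance. The names `row_distance`, `molecule_distance`, and `rbf_kernel` are hypothetical and are not taken from the article.

```python
import math

# Assumed layout: one row per atom pair, each row a fixed-length tuple of
# property values; the number of rows differs from molecule to molecule.

def row_distance(a, b):
    """Euclidean distance between two equal-length property rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def molecule_distance(m1, m2):
    """Symmetric average nearest-row distance (an assumed definition,
    not the distance defined in the article)."""
    if not m1 or not m2:
        raise ValueError("each descriptor needs at least one row")

    def directed(src, dst):
        # Mean distance from each row of src to its closest row in dst.
        return sum(min(row_distance(r, s) for s in dst) for r in src) / len(src)

    return 0.5 * (directed(m1, m2) + directed(m2, m1))

def rbf_kernel(m1, m2, gamma=1.0):
    """Map the distance to a similarity in (0, 1] for a kernel method."""
    return math.exp(-gamma * molecule_distance(m1, m2) ** 2)

# Two toy molecules with different numbers of atom-pair rows.
mol_a = [(1.0, 0.0), (0.0, 1.0)]
mol_b = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(molecule_distance(mol_a, mol_a))  # identical molecules
print(molecule_distance(mol_a, mol_b))  # different row counts are allowed
print(rbf_kernel(mol_a, mol_b))
```

A pairwise matrix of such kernel values could in principle be passed to a support vector machine that accepts a precomputed kernel. Note that a kernel built from an arbitrary distance like this one is not guaranteed to be positive semidefinite, so this is a starting point for experimentation rather than a validated method.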