
The rcdk and cluster R packages applied to drug candidate selection

The aim of this article is to show how the power of statistics and cheminformatics can be combined, in R, using two packages: rcdk and cluster. We describe the role of clustering methods for identifying similar structures in a group of 23 molecules according to their fingerprints. The most commonly...


Bibliographic Details
Main authors: Voicu, Adrian; Duteanu, Narcis; Voicu, Mirela; Vlad, Daliborca; Dumitrascu, Victor
Format: Online Article Text
Language: English
Published: Springer International Publishing 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6970292/
https://www.ncbi.nlm.nih.gov/pubmed/33430987
http://dx.doi.org/10.1186/s13321-019-0405-0
_version_ 1783489490661146624
author Voicu, Adrian
Duteanu, Narcis
Voicu, Mirela
Vlad, Daliborca
Dumitrascu, Victor
author_facet Voicu, Adrian
Duteanu, Narcis
Voicu, Mirela
Vlad, Daliborca
Dumitrascu, Victor
author_sort Voicu, Adrian
collection PubMed
description The aim of this article is to show how the power of statistics and cheminformatics can be combined, in R, using two packages: rcdk and cluster. We describe the role of clustering methods for identifying similar structures in a group of 23 molecules according to their fingerprints. The most commonly used method is to group the molecules using a “score” obtained by measuring the average distance between them. This score reflects the similarity or dissimilarity between compounds and helps us identify active or potentially toxic substances through predictive studies. Clustering is the process by which the common characteristics of a particular class of compounds are identified. For clustering applications, we generally measure the molecular fingerprint similarity with the Tanimoto coefficient. Based on the molecular fingerprints, we calculated the molecular distances between the methotrexate molecule and the other 23 molecules in the group, and organized them into a matrix. According to the molecular distances and Ward's method, the molecules were grouped into 3 clusters. We can presume structural similarity between the compounds and their locations in the cluster map. Because only 5 molecules were included in the methotrexate cluster, we considered that they might have similar properties and might be further tested as potential drug candidates.
format Online
Article
Text
id pubmed-6970292
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-69702922020-01-27 The rcdk and cluster R packages applied to drug candidate selection Voicu, Adrian Duteanu, Narcis Voicu, Mirela Vlad, Daliborca Dumitrascu, Victor J Cheminform Educational The aim of this article is to show how the power of statistics and cheminformatics can be combined, in R, using two packages: rcdk and cluster. We describe the role of clustering methods for identifying similar structures in a group of 23 molecules according to their fingerprints. The most commonly used method is to group the molecules using a “score” obtained by measuring the average distance between them. This score reflects the similarity or dissimilarity between compounds and helps us identify active or potentially toxic substances through predictive studies. Clustering is the process by which the common characteristics of a particular class of compounds are identified. For clustering applications, we generally measure the molecular fingerprint similarity with the Tanimoto coefficient. Based on the molecular fingerprints, we calculated the molecular distances between the methotrexate molecule and the other 23 molecules in the group, and organized them into a matrix. According to the molecular distances and Ward's method, the molecules were grouped into 3 clusters. We can presume structural similarity between the compounds and their locations in the cluster map. Because only 5 molecules were included in the methotrexate cluster, we considered that they might have similar properties and might be further tested as potential drug candidates.
Springer International Publishing 2020-01-20 /pmc/articles/PMC6970292/ /pubmed/33430987 http://dx.doi.org/10.1186/s13321-019-0405-0 Text en © The Author(s) 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Educational
Voicu, Adrian
Duteanu, Narcis
Voicu, Mirela
Vlad, Daliborca
Dumitrascu, Victor
The rcdk and cluster R packages applied to drug candidate selection
title The rcdk and cluster R packages applied to drug candidate selection
title_full The rcdk and cluster R packages applied to drug candidate selection
title_fullStr The rcdk and cluster R packages applied to drug candidate selection
title_full_unstemmed The rcdk and cluster R packages applied to drug candidate selection
title_short The rcdk and cluster R packages applied to drug candidate selection
title_sort rcdk and cluster r packages applied to drug candidate selection
topic Educational
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6970292/
https://www.ncbi.nlm.nih.gov/pubmed/33430987
http://dx.doi.org/10.1186/s13321-019-0405-0
work_keys_str_mv AT voicuadrian thercdkandclusterrpackagesappliedtodrugcandidateselection
AT duteanunarcis thercdkandclusterrpackagesappliedtodrugcandidateselection
AT voicumirela thercdkandclusterrpackagesappliedtodrugcandidateselection
AT vladdaliborca thercdkandclusterrpackagesappliedtodrugcandidateselection
AT dumitrascuvictor thercdkandclusterrpackagesappliedtodrugcandidateselection
AT voicuadrian rcdkandclusterrpackagesappliedtodrugcandidateselection
AT duteanunarcis rcdkandclusterrpackagesappliedtodrugcandidateselection
AT voicumirela rcdkandclusterrpackagesappliedtodrugcandidateselection
AT vladdaliborca rcdkandclusterrpackagesappliedtodrugcandidateselection
AT dumitrascuvictor rcdkandclusterrpackagesappliedtodrugcandidateselection
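
The abstract in this record describes measuring fingerprint similarity with the Tanimoto coefficient and arranging the resulting molecular distances into a matrix. The sketch below illustrates only that step, in plain Python rather than with the rcdk and cluster R packages the article uses. It assumes a fingerprint can be represented as the set of its "on" bit positions and takes distance as 1 minus the Tanimoto coefficient; the function names and the three toy fingerprints are hypothetical and are not taken from the article.

```python
from itertools import combinations


def tanimoto_similarity(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared 'on' bits divided by all 'on' bits."""
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / union


def distance_matrix(fingerprints: dict) -> dict:
    """Symmetric matrix of Tanimoto distances (1 - similarity), keyed by name."""
    names = list(fingerprints)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = 1.0 - tanimoto_similarity(fingerprints[x], fingerprints[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


# Hypothetical toy fingerprints (sets of 'on' bit positions), not real molecules.
toy = {
    "reference": frozenset({1, 2, 3, 4}),
    "analog_a": frozenset({1, 2, 3, 5}),
    "unrelated": frozenset({7, 8, 9}),
}

dist = distance_matrix(toy)

# Rank the other entries by their distance to the reference fingerprint.
ranked = sorted(
    (name for name in toy if name != "reference"),
    key=lambda name: dist["reference"][name],
)
print(ranked)
```

A matrix of this kind is the sort of input a hierarchical method such as Ward's would take; the clustering step itself is not shown here.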