
Selecting Diversified Compounds to Build a Tangible Library for Biological and Biochemical Assays


Bibliographic Details
Main Authors: Gu, Qiong, Xu, Jun, Gu, Lianquan
Format: Online Article Text
Language: English
Published: MDPI 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6257665/
https://www.ncbi.nlm.nih.gov/pubmed/20657406
http://dx.doi.org/10.3390/molecules15075031
Description
Summary: The quality of diverse compound selection mainly depends on cluster algorithms, descriptors, the combinations of the descriptors, and similarity metrics. The Jarvis-Patrick algorithm is a well-accepted clustering algorithm, and MDL search keys and Daylight fingerprints are well-accepted structure descriptors, for compound library diversity analysis. Based upon our 288 experiments on selecting compounds from various descriptor combinations, we have found that (1) hybrid Daylight and MDL structural descriptors for diversity analyses can produce worse results; (2) selections based purely on 2,048-bit Daylight fingerprints yield better results than those based purely on MDL 166-bit search keys; (3) when Daylight fingerprints and MDL search keys are combined, it is better to compute the two similarities independently and then take the smaller value as the outcome, which yields better average separation of clusters; (4) regarding the consistency of different clustering approaches, Daylight fingerprint based clustering is more consistent with the SCA approach than it is with the MDL search keys based approach; (5) the MDL search keys based selection approach tends to select a greater number of compounds from larger clusters. When the Daylight fingerprint is folded two or three times, information is lost, and selection based on the folded fingerprint likewise tends to pick a greater number of compounds from larger clusters. To our knowledge, these results have not been reported before.
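
Findings (3) and (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not name the similarity metric, so the Tanimoto coefficient is assumed here, fingerprints are represented as plain Python integers, folding is assumed to OR the upper half of the bit string onto the lower half, and the function names (`tanimoto`, `combined_similarity`, `fold_once`) are hypothetical.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two bit-string fingerprints stored as ints.

    Assumption: the summary does not state which metric was used.
    """
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # two empty fingerprints: define similarity as 0
    return bin(a & b).count("1") / union


def combined_similarity(fp_a: int, keys_a: int, fp_b: int, keys_b: int) -> float:
    """Finding (3): score each descriptor independently, keep the smaller value."""
    return min(tanimoto(fp_a, fp_b), tanimoto(keys_a, keys_b))


def fold_once(fp: int, n_bits: int) -> int:
    """Fold an n_bits fingerprint to n_bits // 2 by OR-ing its two halves.

    Assumption: this is one common folding scheme, used here for illustration.
    """
    half = n_bits // 2
    mask = (1 << half) - 1
    return (fp & mask) | (fp >> half)


# Toy 8-bit "fingerprints" and 4-bit "keys" for two compounds (made-up values).
fp_a, fp_b = 0b00010000, 0b00000001
keys_a, keys_b = 0b0011, 0b0011

unfolded = tanimoto(fp_a, fp_b)
folded = tanimoto(fold_once(fp_a, 8), fold_once(fp_b, 8))
combined = combined_similarity(fp_a, keys_a, fp_b, keys_b)
```

In this toy case the two fingerprints share no bits before folding but collide onto the same bit afterwards, which is the kind of information loss the summary describes. Taking the minimum means two compounds count as similar only if both descriptors agree that they are.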