
Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis

BACKGROUND: To improve the utility of PubChem, a public repository containing biological activities of small molecules, the PubChem3D project adds computationally-derived three-dimensional (3-D) descriptions to the small-molecule records contained in the PubChem Compound database and provides variou...

Bibliographic Details
Main authors: Kim, Sunghwan, Bolton, Evan E, Bryant, Stephen H
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3537644/
https://www.ncbi.nlm.nih.gov/pubmed/23134593
http://dx.doi.org/10.1186/1758-2946-4-28
_version_ 1782254888676229120
author Kim, Sunghwan
Bolton, Evan E
Bryant, Stephen H
author_facet Kim, Sunghwan
Bolton, Evan E
Bryant, Stephen H
author_sort Kim, Sunghwan
collection PubMed
description BACKGROUND: To improve the utility of PubChem, a public repository containing biological activities of small molecules, the PubChem3D project adds computationally-derived three-dimensional (3-D) descriptions to the small-molecule records contained in the PubChem Compound database and provides various search and analysis tools that exploit 3-D molecular similarity. Therefore, the efficient use of PubChem3D resources requires an understanding of the statistical and biological meaning of computed 3-D molecular similarity scores between molecules. RESULTS: The present study investigated effects of employing multiple conformers per compound upon the 3-D similarity scores between ten thousand randomly selected biologically-tested compounds (10-K set) and between non-inactive compounds in a given biological assay (156-K set). When the “best-conformer-pair” approach, in which a 3-D similarity score between two compounds is represented by the greatest similarity score among all possible conformer pairs arising from a compound pair, was employed with ten diverse conformers per compound, the average 3-D similarity scores for the 10-K set increased by 0.11, 0.09, 0.15, 0.16, 0.07, and 0.18 for ST(ST-opt), CT(ST-opt), ComboT(ST-opt), ST(CT-opt), CT(CT-opt), and ComboT(CT-opt), respectively, relative to the corresponding averages computed using a single conformer per compound. Interestingly, the best-conformer-pair approach also increased the average 3-D similarity scores for the non-inactive–non-inactive (NN) pairs for a given assay, by comparable amounts to those for the random compound pairs, although some assays showed a pronounced increase in the per-assay NN-pair 3-D similarity scores, compared to the average increase for the random compound pairs. 
CONCLUSION: These results suggest that the use of ten diverse conformers per compound in PubChem bioassay data analysis using 3-D molecular similarity is not expected to increase the separation of non-inactive from random and inactive spaces “on average”, although some assays show a noticeable separation between the non-inactive and random spaces when multiple conformers are used for each compound. The present study is a critical next step to understand effects of conformational diversity of the molecules upon the 3-D molecular similarity and its application to biological activity data analysis in PubChem. The results of this study may be helpful to build search and analysis tools that exploit 3-D molecular similarity between compounds archived in PubChem and other molecular libraries in a more efficient way.
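The "best-conformer-pair" approach described in the abstract can be illustrated with a minimal sketch. It is not the PubChem3D implementation: `toy_score` is a hypothetical stand-in (a set-based Tanimoto over made-up feature sets) for the real 3-D similarity scores, and the conformers are invented data. The sketch only shows the selection rule, namely that the score for a compound pair is the greatest score over all conformer pairs.

```python
from itertools import product
from typing import Callable, Sequence, TypeVar

C = TypeVar("C")  # any conformer representation


def best_conformer_pair_score(
    confs_a: Sequence[C],
    confs_b: Sequence[C],
    score: Callable[[C, C], float],
) -> float:
    """Return the greatest similarity score over all conformer pairs."""
    if not confs_a or not confs_b:
        raise ValueError("each compound needs at least one conformer")
    return max(score(a, b) for a, b in product(confs_a, confs_b))


def toy_score(a: frozenset, b: frozenset) -> float:
    """Hypothetical stand-in score: Tanimoto coefficient of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Invented data: two "conformers" per compound, each a set of feature ids.
compound_a = [frozenset({1, 2, 3}), frozenset({2, 3, 4})]
compound_b = [frozenset({3, 4, 5}), frozenset({2, 3, 4, 5})]

# Single-conformer score: only the first conformer of each compound is used.
single = toy_score(compound_a[0], compound_b[0])
# Best-conformer-pair score: maximum over all 2 x 2 conformer pairs.
best = best_conformer_pair_score(compound_a, compound_b, toy_score)

print(single, best)
```

Because the single-conformer pair is one of the pairs considered, the best-conformer-pair score can never be lower than the single-conformer score. This is consistent with the abstract's observation that average scores rise for random and non-inactive pairs alike.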
format Online
Article
Text
id pubmed-3537644
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3537644 2013-01-10 Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis Kim, Sunghwan Bolton, Evan E Bryant, Stephen H J Cheminform Research Article BioMed Central 2012-11-07 /pmc/articles/PMC3537644/ /pubmed/23134593 http://dx.doi.org/10.1186/1758-2946-4-28 Text en Copyright ©2012 Kim et al.; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Kim, Sunghwan
Bolton, Evan E
Bryant, Stephen H
Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis
title Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis
title_sort effects of multiple conformers per compound upon 3-d similarity search and bioassay data analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3537644/
https://www.ncbi.nlm.nih.gov/pubmed/23134593
http://dx.doi.org/10.1186/1758-2946-4-28
work_keys_str_mv AT kimsunghwan effectsofmultipleconformerspercompoundupon3dsimilaritysearchandbioassaydataanalysis
AT boltonevane effectsofmultipleconformerspercompoundupon3dsimilaritysearchandbioassaydataanalysis
AT bryantstephenh effectsofmultipleconformerspercompoundupon3dsimilaritysearchandbioassaydataanalysis