Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis
BACKGROUND: To improve the utility of PubChem, a public repository containing biological activities of small molecules, the PubChem3D project adds computationally-derived three-dimensional (3-D) descriptions to the small-molecule records contained in the PubChem Compound database and provides variou...
Main authors: | Kim, Sunghwan; Bolton, Evan E; Bryant, Stephen H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3537644/ https://www.ncbi.nlm.nih.gov/pubmed/23134593 http://dx.doi.org/10.1186/1758-2946-4-28 |
_version_ | 1782254888676229120 |
---|---|
author | Kim, Sunghwan Bolton, Evan E Bryant, Stephen H |
author_facet | Kim, Sunghwan Bolton, Evan E Bryant, Stephen H |
author_sort | Kim, Sunghwan |
collection | PubMed |
description | BACKGROUND: To improve the utility of PubChem, a public repository containing biological activities of small molecules, the PubChem3D project adds computationally-derived three-dimensional (3-D) descriptions to the small-molecule records contained in the PubChem Compound database and provides various search and analysis tools that exploit 3-D molecular similarity. Therefore, the efficient use of PubChem3D resources requires an understanding of the statistical and biological meaning of computed 3-D molecular similarity scores between molecules. RESULTS: The present study investigated the effects of employing multiple conformers per compound upon the 3-D similarity scores between ten thousand randomly selected biologically-tested compounds (10-K set) and between non-inactive compounds in a given biological assay (156-K set). When the “best-conformer-pair” approach, in which a 3-D similarity score between two compounds is represented by the greatest similarity score among all possible conformer pairs arising from a compound pair, was employed with ten diverse conformers per compound, the average 3-D similarity scores for the 10-K set increased by 0.11, 0.09, 0.15, 0.16, 0.07, and 0.18 for ST(ST-opt), CT(ST-opt), ComboT(ST-opt), ST(CT-opt), CT(CT-opt), and ComboT(CT-opt), respectively, relative to the corresponding averages computed using a single conformer per compound. Interestingly, the best-conformer-pair approach also increased the average 3-D similarity scores for the non-inactive–non-inactive (NN) pairs for a given assay by amounts comparable to those for the random compound pairs, although some assays showed a pronounced increase in the per-assay NN-pair 3-D similarity scores compared to the average increase for the random compound pairs. CONCLUSION: These results suggest that the use of ten diverse conformers per compound in PubChem bioassay data analysis using 3-D molecular similarity is not expected to increase the separation of non-inactive from random and inactive spaces “on average”, although some assays show a noticeable separation between the non-inactive and random spaces when multiple conformers are used for each compound. The present study is a critical next step toward understanding the effects of molecular conformational diversity upon 3-D molecular similarity and its application to biological activity data analysis in PubChem. The results of this study may help in building search and analysis tools that exploit 3-D molecular similarity between compounds archived in PubChem and other molecular libraries in a more efficient way. |
format | Online Article Text |
id | pubmed-3537644 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3537644 2013-01-10 Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis Kim, Sunghwan Bolton, Evan E Bryant, Stephen H J Cheminform Research Article BioMed Central 2012-11-07 /pmc/articles/PMC3537644/ /pubmed/23134593 http://dx.doi.org/10.1186/1758-2946-4-28 Text en Copyright ©2012 Kim et al.; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Kim, Sunghwan Bolton, Evan E Bryant, Stephen H Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis |
title | Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis |
title_full | Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis |
title_fullStr | Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis |
title_full_unstemmed | Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis |
title_short | Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis |
title_sort | effects of multiple conformers per compound upon 3-d similarity search and bioassay data analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3537644/ https://www.ncbi.nlm.nih.gov/pubmed/23134593 http://dx.doi.org/10.1186/1758-2946-4-28 |
work_keys_str_mv | AT kimsunghwan effectsofmultipleconformerspercompoundupon3dsimilaritysearchandbioassaydataanalysis AT boltonevane effectsofmultipleconformerspercompoundupon3dsimilaritysearchandbioassaydataanalysis AT bryantstephenh effectsofmultipleconformerspercompoundupon3dsimilaritysearchandbioassaydataanalysis |
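
The abstract in this record defines the "best-conformer-pair" approach as taking, for two compounds, the greatest similarity score among all possible conformer pairs. The following is a minimal Python sketch of that definition only, not the authors' implementation. The function name `best_conformer_pair_score` and the stand-in scorer `toy_score` are hypothetical; in particular, `toy_score` is an arbitrary placeholder and is not the ST, CT, or ComboT measure used in the study.

```python
from itertools import product
from typing import Callable, Sequence, TypeVar

Conformer = TypeVar("Conformer")


def best_conformer_pair_score(
    confs_a: Sequence[Conformer],
    confs_b: Sequence[Conformer],
    score: Callable[[Conformer, Conformer], float],
) -> float:
    """Return the greatest score over all conformer pairs of two compounds.

    `score` is any pairwise conformer similarity function supplied by the
    caller (an assumption of this sketch; the record does not specify one).
    """
    if not confs_a or not confs_b:
        raise ValueError("each compound needs at least one conformer")
    # Enumerate every (conformer of A, conformer of B) pair and keep the best.
    return max(score(a, b) for a, b in product(confs_a, confs_b))


def toy_score(a: float, b: float) -> float:
    """Hypothetical placeholder scorer: NOT a real 3-D similarity measure.

    Each "conformer" is reduced to a single number purely for illustration.
    """
    return 1.0 / (1.0 + abs(a - b))


# Illustrative, made-up conformer descriptors for two compounds.
compound_a = [0.0, 1.0, 2.5]
compound_b = [2.0, 4.0]

# Single-conformer score (first conformer of each compound only).
single = toy_score(compound_a[0], compound_b[0])
# Best-conformer-pair score over all 3 x 2 = 6 pairs.
best = best_conformer_pair_score(compound_a, compound_b, toy_score)
```

Because the single-conformer pair is one of the pairs being maximized over, the best-conformer-pair score can never be lower than the single-conformer score. That is consistent with the abstract's observation that average scores rise for random and non-inactive pairs alike when multiple conformers are used.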