Target enhanced 2D similarity search by using explicit biological activity annotations and profiles

BACKGROUND: The enriched biological activity information of compounds in large and freely-accessible chemical databases like the PubChem Bioassay Database has become a powerful research resource for the scientific research community. Currently, 2D fingerprint based conventional similarity search (CS...

Bibliographic Details
Main Authors: Yu, Xiang, Geer, Lewis Y., Han, Lianyi, Bryant, Stephen H.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4648974/
https://www.ncbi.nlm.nih.gov/pubmed/26583046
http://dx.doi.org/10.1186/s13321-015-0103-5
author Yu, Xiang
Geer, Lewis Y.
Han, Lianyi
Bryant, Stephen H.
collection PubMed
description BACKGROUND: The enriched biological activity information of compounds in large and freely accessible chemical databases like the PubChem Bioassay Database has become a powerful resource for the scientific research community. Currently, 2D fingerprint-based conventional similarity search (CSS) is the most widely used approach for database screening, but it does not typically incorporate the relative importance of fingerprint bits to biological activity. RESULTS: In this study, a large-scale similarity search investigation has been carried out on 208 well-defined compound activity classes extracted from the PubChem Bioassay Database. An analysis was performed to compare the search performance of three types of 2D similarity search approaches: the 2D fingerprint-based conventional similarity search approach (CSS), the iterative similarity search approach with multiple active compounds as references (ISS), and the fingerprint-based iterative similarity search with classification (ISC), which can be regarded as the combination of an iterative similarity search with active references and a reversed iterative similarity search with inactive references. Compared to the search results returned by CSS, ISS improves recall but not precision. Although ISC causes the false rejection of active hits, it improves the precision with statistical significance, and outperforms both ISS and CSS. In a second part of this study, we introduce the profile concept into the three types of searches. We find that the profile-based non-iterative search can significantly improve the search performance by increasing the recall rate. We also find that profile-based ISS (PBISS) and profile-based ISC (PBISC) significantly decrease ISS search time without sacrificing search performance.
CONCLUSIONS: On the basis of our large-scale investigation directed against a wide spectrum of pharmaceutical targets, we conclude that ISC and ISS searches perform better than 2D fingerprint similarity searching and that profile-based versions of these algorithms do nearly as well in less time. We also suggest that the profile versions of the iterative similarity searches are both better performing and potentially quicker than the standard algorithm. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-015-0103-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4648974
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-4648974 2015-11-19 Target enhanced 2D similarity search by using explicit biological activity annotations and profiles Yu, Xiang Geer, Lewis Y. Han, Lianyi Bryant, Stephen H. J Cheminform Research Article Springer International Publishing 2015-11-17 /pmc/articles/PMC4648974/ /pubmed/26583046 http://dx.doi.org/10.1186/s13321-015-0103-5 Text en © Yu et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Target enhanced 2D similarity search by using explicit biological activity annotations and profiles
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4648974/
https://www.ncbi.nlm.nih.gov/pubmed/26583046
http://dx.doi.org/10.1186/s13321-015-0103-5