A maximum common substructure-based algorithm for searching and predicting drug-like compounds

Motivation: The prediction of biologically active compounds is of great importance for high-throughput screening (HTS) approaches in drug discovery and chemical genomics. Many computational methods in this area focus on measuring the structural similarities between chemical structures. However, trad...

Bibliographic Details
Main authors:	Cao, Yiqun; Jiang, Tao; Girke, Thomas
Format:	Text
Language:	English
Published:	Oxford University Press 2008
Subjects:	ISMB 2008 Conference Proceedings, 19–23 July 2008, Toronto
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718661/ https://www.ncbi.nlm.nih.gov/pubmed/18586736 http://dx.doi.org/10.1093/bioinformatics/btn186

author	Cao, Yiqun; Jiang, Tao; Girke, Thomas
collection	PubMed
description	Motivation: The prediction of biologically active compounds is of great importance for high-throughput screening (HTS) approaches in drug discovery and chemical genomics. Many computational methods in this area focus on measuring the structural similarities between chemical structures. However, traditional similarity measures are often too rigid or consider only global similarities between structures. The maximum common substructure (MCS) approach provides a more promising and flexible alternative for predicting bioactive compounds. Results: In this article, a new backtracking algorithm for MCS is proposed and compared to global similarity measurements. Our algorithm provides high flexibility in the matching process, and it is very efficient in identifying local structural similarities. To predict and cluster biologically active compounds more efficiently, the concept of basis compounds is proposed that enables researchers to easily combine the MCS-based and traditional similarity measures with modern machine learning techniques. Support vector machines (SVMs) are used to test how the MCS-based similarity measure and the basis compound vectorization method perform on two empirically tested datasets. The test results show that MCS complements the well-known atom pair descriptor-based similarity measure. By combining these two measures, our SVM-based model predicts the biological activities of chemical compounds with higher specificity and sensitivity. Contact: ycao@cs.ucr.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format	Text
id	pubmed-2718661
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-2718661 2009-07-31 Bioinformatics Oxford University Press 2008-07-01 /pmc/articles/PMC2718661/ /pubmed/18586736 http://dx.doi.org/10.1093/bioinformatics/btn186 Text en © 2008 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	A maximum common substructure-based algorithm for searching and predicting drug-like compounds
topic	ISMB 2008 Conference Proceedings, 19–23 July 2008, Toronto
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718661/ https://www.ncbi.nlm.nih.gov/pubmed/18586736 http://dx.doi.org/10.1093/bioinformatics/btn186


Similar items