Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling

BACKGROUND: Identification of functionally important sites in biomolecular sequences has broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks. Experimental determination of such sites lags far behind the number of known biomolecular seque...

Bibliographic Details
Main authors: Caragea, Cornelia; Sinapov, Jivko; Dobbs, Drena; Honavar, Vasant
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2681071/
https://www.ncbi.nlm.nih.gov/pubmed/19426452
http://dx.doi.org/10.1186/1471-2105-10-S4-S4
author Caragea, Cornelia
Sinapov, Jivko
Dobbs, Drena
Honavar, Vasant
collection PubMed
description BACKGROUND: Identification of functionally important sites in biomolecular sequences has broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks. Experimental determination of such sites lags far behind the number of known biomolecular sequences. Hence, there is a need to develop reliable computational methods for identifying functionally important sites from biomolecular sequences. RESULTS: We present a mixture of experts approach to biomolecular sequence labeling that takes into account the global similarity between biomolecular sequences. Our approach combines unsupervised and supervised learning techniques. Given a set of sequences and a similarity measure defined on pairs of sequences, we learn a mixture of experts model by using spectral clustering to learn the hierarchical structure of the model and by using Bayesian techniques to combine the predictions of the experts. We evaluate our approach on two biomolecular sequence labeling problems: RNA-protein and DNA-protein interface prediction problems. The results of our experiments show that global sequence similarity can be exploited to improve the performance of classifiers trained to label biomolecular sequence data. CONCLUSION: The mixture of experts model helps improve the performance of machine learning methods for identifying functionally important sites in biomolecular sequences.
format Text
id pubmed-2681071
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2681071 2009-05-13 Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling Caragea, Cornelia Sinapov, Jivko Dobbs, Drena Honavar, Vasant BMC Bioinformatics Proceedings BioMed Central 2009-04-29 /pmc/articles/PMC2681071/ /pubmed/19426452 http://dx.doi.org/10.1186/1471-2105-10-S4-S4 Text en Copyright © 2009 Caragea et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2681071/
https://www.ncbi.nlm.nih.gov/pubmed/19426452
http://dx.doi.org/10.1186/1471-2105-10-S4-S4