CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction
Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homol...
Main Authors: | Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908355/ https://www.ncbi.nlm.nih.gov/pubmed/27307635 http://dx.doi.org/10.1093/bioinformatics/btw271 |
_version_ | 1782437666565914624 |
---|---|
author | Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin |
author_facet | Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin |
author_sort | Cui, Xuefeng |
collection | PubMed |
description | Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods. Availability and implementation: Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-4908355 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4908355 2016-06-17 CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction Cui, Xuefeng Lu, Zhiwu Wang, Sheng Jing-Yan Wang, Jim Gao, Xin Bioinformatics ISMB 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods. Availability and implementation: Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2016-06-15 2016-06-11 /pmc/articles/PMC4908355/ /pubmed/27307635 http://dx.doi.org/10.1093/bioinformatics/btw271 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | ISMB 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida Cui, Xuefeng Lu, Zhiwu Wang, Sheng Jing-Yan Wang, Jim Gao, Xin CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction |
title | CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction |
topic | ISMB 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908355/ https://www.ncbi.nlm.nih.gov/pubmed/27307635 http://dx.doi.org/10.1093/bioinformatics/btw271 |