
MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts

BACKGROUND: Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It has been used essentially in almost all bioinformatics tasks such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefor...


Bibliographic Details
Main authors: Deng, Xin, Cheng, Jianlin
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3299741/
https://www.ncbi.nlm.nih.gov/pubmed/22168237
http://dx.doi.org/10.1186/1471-2105-12-472
_version_ 1782226161909104640
author Deng, Xin
Cheng, Jianlin
author_facet Deng, Xin
Cheng, Jianlin
author_sort Deng, Xin
collection PubMed
description BACKGROUND: Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It is used in almost all bioinformatics tasks, such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefore, improving the accuracy of multiple sequence alignment is important for advancing many bioinformatics fields. RESULTS: We designed and developed a new method, MSACompro, to synergistically incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. The method differs from multiple sequence alignment methods (e.g. 3D-Coffee) that use the tertiary structure information of some sequences, because the structural information used by our method is predicted entirely from sequences. To the best of our knowledge, applying predicted relative solvent accessibility and contact maps to multiple sequence alignment is novel. Rigorous benchmarking of our method on the standard benchmarks (i.e. BAliBASE, SABmark and OXBENCH) demonstrated that incorporating predicted protein structural information improves multiple sequence alignment accuracy over leading multiple protein sequence alignment tools that do not use this information, such as MSAProbs, ProbCons, Probalign, T-Coffee, MAFFT and MUSCLE. The performance of the method is comparable to, with slightly lower scores than, that of the state-of-the-art method PROMALS, which uses structural features and additional homologous sequences. CONCLUSION: MSACompro is an efficient and reliable multiple protein sequence alignment tool that can effectively incorporate predicted protein structural information into multiple sequence alignment. The software is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.
format Online
Article
Text
id pubmed-3299741
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3299741 2012-03-14 MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts Deng, Xin Cheng, Jianlin BMC Bioinformatics Research Article BioMed Central 2011-12-14 /pmc/articles/PMC3299741/ /pubmed/22168237 http://dx.doi.org/10.1186/1471-2105-12-472 Text en Copyright ©2011 Deng and Cheng; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Deng, Xin
Cheng, Jianlin
MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts
title MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3299741/
https://www.ncbi.nlm.nih.gov/pubmed/22168237
http://dx.doi.org/10.1186/1471-2105-12-472