MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts
BACKGROUND: Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It has been used essentially in almost all bioinformatics tasks such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefor...
Main Authors: | Deng, Xin; Cheng, Jianlin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3299741/ https://www.ncbi.nlm.nih.gov/pubmed/22168237 http://dx.doi.org/10.1186/1471-2105-12-472 |
_version_ | 1782226161909104640 |
---|---|
author | Deng, Xin Cheng, Jianlin |
author_sort | Deng, Xin |
collection | PubMed |
description | BACKGROUND: Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It has been used essentially in almost all bioinformatics tasks such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefore, improving the accuracy of multiple sequence alignment is important for advancing many bioinformatics fields. RESULTS: We designed and developed a new method, MSACompro, to synergistically incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. The method is different from the multiple sequence alignment methods (e.g. 3D-Coffee) that use the tertiary structure information of some sequences since the structural information of our method is fully predicted from sequences. To the best of our knowledge, applying predicted relative solvent accessibility and contact map to multiple sequence alignment is novel. The rigorous benchmarking of our method to the standard benchmarks (i.e. BAliBASE, SABmark and OXBENCH) clearly demonstrated that incorporating predicted protein structural information improves the multiple sequence alignment accuracy over the leading multiple protein sequence alignment tools without using this information, such as MSAProbs, ProbCons, Probalign, T-coffee, MAFFT and MUSCLE. And the performance of the method is comparable to the state-of-the-art method PROMALS of using structural features and additional homologous sequences by slightly lower scores. CONCLUSION: MSACompro is an efficient and reliable multiple protein sequence alignment tool that can effectively incorporate predicted protein structural information into multiple sequence alignment. The software is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/. |
format | Online Article Text |
id | pubmed-3299741 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-32997412012-03-14 MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts Deng, Xin Cheng, Jianlin BMC Bioinformatics Research Article BACKGROUND: Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It has been used essentially in almost all bioinformatics tasks such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefore, improving the accuracy of multiple sequence alignment is important for advancing many bioinformatics fields. RESULTS: We designed and developed a new method, MSACompro, to synergistically incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. The method is different from the multiple sequence alignment methods (e.g. 3D-Coffee) that use the tertiary structure information of some sequences since the structural information of our method is fully predicted from sequences. To the best of our knowledge, applying predicted relative solvent accessibility and contact map to multiple sequence alignment is novel. The rigorous benchmarking of our method to the standard benchmarks (i.e. BAliBASE, SABmark and OXBENCH) clearly demonstrated that incorporating predicted protein structural information improves the multiple sequence alignment accuracy over the leading multiple protein sequence alignment tools without using this information, such as MSAProbs, ProbCons, Probalign, T-coffee, MAFFT and MUSCLE. And the performance of the method is comparable to the state-of-the-art method PROMALS of using structural features and additional homologous sequences by slightly lower scores. 
CONCLUSION: MSACompro is an efficient and reliable multiple protein sequence alignment tool that can effectively incorporate predicted protein structural information into multiple sequence alignment. The software is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/. BioMed Central 2011-12-14 /pmc/articles/PMC3299741/ /pubmed/22168237 http://dx.doi.org/10.1186/1471-2105-12-472 Text en Copyright ©2011 Deng and Cheng; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Deng, Xin Cheng, Jianlin MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts |
title | MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3299741/ https://www.ncbi.nlm.nih.gov/pubmed/22168237 http://dx.doi.org/10.1186/1471-2105-12-472 |