eCOMPASS: evaluative comparison of multiple protein alignments by statistical score
MOTIVATION: Detecting subtle biologically relevant patterns in protein sequences often requires the construction of a large and accurate multiple sequence alignment (MSA). Methods for constructing MSAs are usually evaluated using benchmark alignments, which, however, typically contain very few seque...
Main authors: | Neuwald, Andrew F; Kolaczkowski, Bryan D; Altschul, Stephen F |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545322/ https://www.ncbi.nlm.nih.gov/pubmed/33983436 http://dx.doi.org/10.1093/bioinformatics/btab374 |
_version_ | 1784589991887765504 |
---|---|
author | Neuwald, Andrew F Kolaczkowski, Bryan D Altschul, Stephen F |
author_facet | Neuwald, Andrew F Kolaczkowski, Bryan D Altschul, Stephen F |
author_sort | Neuwald, Andrew F |
collection | PubMed |
description | MOTIVATION: Detecting subtle biologically relevant patterns in protein sequences often requires the construction of a large and accurate multiple sequence alignment (MSA). Methods for constructing MSAs are usually evaluated using benchmark alignments, which, however, typically contain very few sequences and are therefore inappropriate when dealing with large numbers of proteins. RESULTS: eCOMPASS addresses this problem using a statistical measure of relative alignment quality based on direct coupling analysis (DCA): to maintain protein structural integrity over evolutionary time, substitutions at one residue position typically result in compensating substitutions at other positions. eCOMPASS computes the statistical significance of the congruence between high scoring directly coupled pairs and 3D contacts in corresponding structures, which depends upon properly aligned homologous residues. We illustrate eCOMPASS using both simulated and real MSAs. AVAILABILITY AND IMPLEMENTATION: The eCOMPASS executable, C++ open source code and input data sets are available at https://www.igs.umaryland.edu/labs/neuwald/software/compass SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8545322 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-85453222021-10-26 eCOMPASS: evaluative comparison of multiple protein alignments by statistical score Neuwald, Andrew F Kolaczkowski, Bryan D Altschul, Stephen F Bioinformatics Original Papers MOTIVATION: Detecting subtle biologically relevant patterns in protein sequences often requires the construction of a large and accurate multiple sequence alignment (MSA). Methods for constructing MSAs are usually evaluated using benchmark alignments, which, however, typically contain very few sequences and are therefore inappropriate when dealing with large numbers of proteins. RESULTS: eCOMPASS addresses this problem using a statistical measure of relative alignment quality based on direct coupling analysis (DCA): to maintain protein structural integrity over evolutionary time, substitutions at one residue position typically result in compensating substitutions at other positions. eCOMPASS computes the statistical significance of the congruence between high scoring directly coupled pairs and 3D contacts in corresponding structures, which depends upon properly aligned homologous residues. We illustrate eCOMPASS using both simulated and real MSAs. AVAILABILITY AND IMPLEMENTATION: The eCOMPASS executable, C++ open source code and input data sets are available at https://www.igs.umaryland.edu/labs/neuwald/software/compass SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-05-13 /pmc/articles/PMC8545322/ /pubmed/33983436 http://dx.doi.org/10.1093/bioinformatics/btab374 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Neuwald, Andrew F Kolaczkowski, Bryan D Altschul, Stephen F eCOMPASS: evaluative comparison of multiple protein alignments by statistical score |
title | eCOMPASS: evaluative comparison of multiple protein alignments by statistical score |
title_full | eCOMPASS: evaluative comparison of multiple protein alignments by statistical score |
title_fullStr | eCOMPASS: evaluative comparison of multiple protein alignments by statistical score |
title_full_unstemmed | eCOMPASS: evaluative comparison of multiple protein alignments by statistical score |
title_short | eCOMPASS: evaluative comparison of multiple protein alignments by statistical score |
title_sort | ecompass: evaluative comparison of multiple protein alignments by statistical score |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545322/ https://www.ncbi.nlm.nih.gov/pubmed/33983436 http://dx.doi.org/10.1093/bioinformatics/btab374 |
work_keys_str_mv | AT neuwaldandrewf ecompassevaluativecomparisonofmultipleproteinalignmentsbystatisticalscore AT kolaczkowskibryand ecompassevaluativecomparisonofmultipleproteinalignmentsbystatisticalscore AT altschulstephenf ecompassevaluativecomparisonofmultipleproteinalignmentsbystatisticalscore |
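
The description above says that eCOMPASS scores an alignment by the statistical significance of the congruence between high-scoring directly coupled pairs and 3D contacts. The record does not state the statistic itself, so the following is only a minimal sketch of the general idea, not the eCOMPASS method: it assumes a plain hypergeometric tail probability as a stand-in, and the function names `hypergeom_tail` and `contact_congruence`, the toy scores and the toy contact set are all hypothetical.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(total_pairs, contact_pairs, top_pairs, hits):
    """P(X >= hits) when top_pairs column pairs are drawn without replacement
    from total_pairs, of which contact_pairs are structural contacts.
    Assumption: a stand-in significance measure, not the eCOMPASS statistic."""
    upper = min(contact_pairs, top_pairs)
    favourable = sum(
        comb(contact_pairs, k) * comb(total_pairs - contact_pairs, top_pairs - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(total_pairs, top_pairs)


def contact_congruence(scored_pairs, contacts, top_n):
    """Count how many of the top_n highest-scoring pairs are contacts and
    return (hits, tail probability). scored_pairs maps (i, j) -> coupling score."""
    ranked = sorted(scored_pairs, key=scored_pairs.get, reverse=True)[:top_n]
    hits = sum(1 for pair in ranked if pair in contacts)
    n_contacts = sum(1 for pair in scored_pairs if pair in contacts)
    return hits, hypergeom_tail(len(scored_pairs), n_contacts, top_n, hits)


# Toy data (hypothetical): 5 alignment columns give 10 column pairs, 3 of them contacts.
pairs = list(combinations(range(5), 2))
contacts = {(0, 1), (1, 2), (2, 3)}

# A "well aligned" MSA whose strongest couplings coincide with the contacts,
# and a "poorly aligned" MSA whose strongest couplings miss them.
good_scores = {p: (1.0 if p in contacts else 0.1) for p in pairs}
poor_scores = {p: (0.1 if p in contacts else 1.0) for p in pairs}

good_hits, good_p = contact_congruence(good_scores, contacts, top_n=3)
poor_hits, poor_p = contact_congruence(poor_scores, contacts, top_n=3)
print(good_hits, good_p)
print(poor_hits, poor_p)
```

Under these assumptions, a smaller tail probability indicates that the top-scoring pairs overlap the contacts more than chance would predict, which is the sense in which two alignments of the same sequences could be ranked against each other.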