Evolutionary inaccuracy of pairwise structural alignments

Motivation: Structural alignment methods are widely used to generate gold standard alignments for improving multiple sequence alignments and transferring functional annotations, as well as for assigning structural distances between proteins. However, the correctness of the alignments generated by th...

Bibliographic Details
Main authors: Sadowski, M. I., Taylor, W. R.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3338010/
https://www.ncbi.nlm.nih.gov/pubmed/22399676
http://dx.doi.org/10.1093/bioinformatics/bts103
_version_ 1782231145915613184
author Sadowski, M. I.
Taylor, W. R.
collection PubMed
description Motivation: Structural alignment methods are widely used to generate gold standard alignments for improving multiple sequence alignments and transferring functional annotations, as well as for assigning structural distances between proteins. However, the correctness of the alignments generated by these methods is difficult to assess objectively since little is known about the exact evolutionary history of most proteins. Since homology is an equivalence relation, an upper bound on alignment quality can be found by assessing the consistency of alignments. Measuring the consistency of current methods of structure alignment and determining the causes of inconsistencies can, therefore, provide information on the quality of current methods and suggest possibilities for further improvement. Results: We analyze the self-consistency of seven widely-used structural alignment methods (SAP, TM-align, Fr-TM-align, MAMMOTH, DALI, CE and FATCAT) on a diverse, non-redundant set of 1863 domains from the SCOP database and demonstrate that even for relatively similar proteins the degree of inconsistency of the alignments on a residue level is high (30%). We further show that levels of consistency vary substantially between methods, with two methods (SAP and Fr-TM-align) producing more consistent alignments than the rest. Inconsistency is found to be higher near gaps and for proteins of low structural complexity, as well as for helices. The ability of the methods to identify good structural alignments is also assessed using geometric measures, for which FATCAT (flexible mode) is found to be the best performer despite being highly inconsistent. We conclude that there is substantial scope for improving the consistency of structural alignment methods. Contact: msadows@nimr.mrc.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
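The consistency argument in the description can be illustrated with a small sketch. Because homology is an equivalence relation, residue equivalences should be transitive: if residue i of protein A is aligned to residue j of B, and j is aligned to residue k of C, then the direct A-C alignment should also pair i with k. The function name, the dictionary representation of an alignment, and the scoring rule below are illustrative assumptions, not the measure defined by the authors.

```python
from typing import Dict

# Assumed representation: an alignment maps a residue index in the first
# protein to the residue index it is aligned with in the second protein.
# Unaligned (gapped) residues are simply absent from the dictionary.
Alignment = Dict[int, int]


def transitive_consistency(aln_ab: Alignment,
                           aln_bc: Alignment,
                           aln_ac: Alignment) -> float:
    """Fraction of residues whose A->B->C path agrees with the direct A->C pairing.

    Only residues of A that can be followed through both A-B and B-C are
    counted. This is a hypothetical scoring rule used for illustration.
    """
    checked = 0
    agree = 0
    for i, j in aln_ab.items():
        k = aln_bc.get(j)
        if k is None:
            continue  # residue j of B is not aligned to anything in C
        checked += 1
        if aln_ac.get(i) == k:
            agree += 1
    if checked == 0:
        raise ValueError("no residue is aligned through both A-B and B-C")
    return agree / checked


# Toy alignments (made-up indices, not real protein data).
aln_ab = {0: 0, 1: 1, 2: 2, 3: 4}
aln_bc = {0: 1, 1: 2, 2: 3, 4: 5}
aln_ac = {0: 1, 1: 2, 2: 4, 3: 5}

score = transitive_consistency(aln_ab, aln_bc, aln_ac)
print(f"consistent fraction: {score:.2f}")
```

In this toy case, three of the four residues traced through B land on the same residue of C as the direct alignment, so one residue is inconsistent. A score below 1 shows that at least one of the three alignments must be wrong for that residue, which is why consistency gives an upper bound on quality rather than a direct measure of correctness.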
format Online
Article
Text
id pubmed-3338010
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3338010 2012-04-27 Evolutionary inaccuracy of pairwise structural alignments Sadowski, M. I. Taylor, W. R. Bioinformatics Original Papers Oxford University Press 2012-05-01 2012-03-06 /pmc/articles/PMC3338010/ /pubmed/22399676 http://dx.doi.org/10.1093/bioinformatics/bts103 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evolutionary inaccuracy of pairwise structural alignments
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3338010/
https://www.ncbi.nlm.nih.gov/pubmed/22399676
http://dx.doi.org/10.1093/bioinformatics/bts103