On the reliability and the limits of inference of amino acid sequence alignments
MOTIVATION: Alignments are correspondences between sequences. How reliable are alignments of amino acid sequences of proteins, and what inferences about protein relationships can be drawn? Using techniques not previously applied to these questions, by weighting every possible sequence alignment by i...
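Main authors: | Rajapaksa, Sandun; Sumanaweera, Dinithi; Lesk, Arthur M; Allison, Lloyd; Stuckey, Peter J; Garcia de la Banda, Maria; Abramson, David; Konagurthu, Arun S |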
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | ISCB/Ismb 2022 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235515/ https://www.ncbi.nlm.nih.gov/pubmed/35758808 http://dx.doi.org/10.1093/bioinformatics/btac247 |
_version_ | 1784736328807612416 |
---|---|
author | Rajapaksa, Sandun Sumanaweera, Dinithi Lesk, Arthur M Allison, Lloyd Stuckey, Peter J Garcia de la Banda, Maria Abramson, David Konagurthu, Arun S |
author_facet | Rajapaksa, Sandun Sumanaweera, Dinithi Lesk, Arthur M Allison, Lloyd Stuckey, Peter J Garcia de la Banda, Maria Abramson, David Konagurthu, Arun S |
author_sort | Rajapaksa, Sandun |
collection | PubMed |
description | MOTIVATION: Alignments are correspondences between sequences. How reliable are alignments of amino acid sequences of proteins, and what inferences about protein relationships can be drawn? Using techniques not previously applied to these questions, by weighting every possible sequence alignment by its posterior probability we derive a formal mathematical expectation, and develop an efficient algorithm for computation of the distance between alternative alignments allowing quantitative comparisons of sequence-based alignments with corresponding reference structure alignments. RESULTS: By analyzing the sequences and structures of 1 million protein domain pairs, we report the variation of the expected distance between sequence-based and structure-based alignments, as a function of (Markov time of) sequence divergence. Our results clearly demarcate the ‘daylight’, ‘twilight’ and ‘midnight’ zones for interpreting residue–residue correspondences from sequence information alone. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9235515 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-92355152022-06-29 On the reliability and the limits of inference of amino acid sequence alignments Rajapaksa, Sandun Sumanaweera, Dinithi Lesk, Arthur M Allison, Lloyd Stuckey, Peter J Garcia de la Banda, Maria Abramson, David Konagurthu, Arun S Bioinformatics ISCB/Ismb 2022 MOTIVATION: Alignments are correspondences between sequences. How reliable are alignments of amino acid sequences of proteins, and what inferences about protein relationships can be drawn? Using techniques not previously applied to these questions, by weighting every possible sequence alignment by its posterior probability we derive a formal mathematical expectation, and develop an efficient algorithm for computation of the distance between alternative alignments allowing quantitative comparisons of sequence-based alignments with corresponding reference structure alignments. RESULTS: By analyzing the sequences and structures of 1 million protein domain pairs, we report the variation of the expected distance between sequence-based and structure-based alignments, as a function of (Markov time of) sequence divergence. Our results clearly demarcate the ‘daylight’, ‘twilight’ and ‘midnight’ zones for interpreting residue–residue correspondences from sequence information alone. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-06-27 /pmc/articles/PMC9235515/ /pubmed/35758808 http://dx.doi.org/10.1093/bioinformatics/btac247 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | ISCB/Ismb 2022 Rajapaksa, Sandun Sumanaweera, Dinithi Lesk, Arthur M Allison, Lloyd Stuckey, Peter J Garcia de la Banda, Maria Abramson, David Konagurthu, Arun S On the reliability and the limits of inference of amino acid sequence alignments |
title | On the reliability and the limits of inference of amino acid sequence alignments |
title_full | On the reliability and the limits of inference of amino acid sequence alignments |
title_fullStr | On the reliability and the limits of inference of amino acid sequence alignments |
title_full_unstemmed | On the reliability and the limits of inference of amino acid sequence alignments |
title_short | On the reliability and the limits of inference of amino acid sequence alignments |
title_sort | on the reliability and the limits of inference of amino acid sequence alignments |
topic | ISCB/Ismb 2022 |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235515/ https://www.ncbi.nlm.nih.gov/pubmed/35758808 http://dx.doi.org/10.1093/bioinformatics/btac247 |
work_keys_str_mv | AT rajapaksasandun onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments AT sumanaweeradinithi onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments AT leskarthurm onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments AT allisonlloyd onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments AT stuckeypeterj onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments AT garciadelabandamaria onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments AT abramsondavid onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments AT konagurthuaruns onthereliabilityandthelimitsofinferenceofaminoacidsequencealignments |