On the reliability and the limits of inference of amino acid sequence alignments

MOTIVATION: Alignments are correspondences between sequences. How reliable are alignments of amino acid sequences of proteins, and what inferences about protein relationships can be drawn? Using techniques not previously applied to these questions, by weighting every possible sequence alignment by i...

Bibliographic Details
Main authors: Rajapaksa, Sandun; Sumanaweera, Dinithi; Lesk, Arthur M; Allison, Lloyd; Stuckey, Peter J; Garcia de la Banda, Maria; Abramson, David; Konagurthu, Arun S
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: ISCB/Ismb 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235515/
https://www.ncbi.nlm.nih.gov/pubmed/35758808
http://dx.doi.org/10.1093/bioinformatics/btac247
author Rajapaksa, Sandun
Sumanaweera, Dinithi
Lesk, Arthur M
Allison, Lloyd
Stuckey, Peter J
Garcia de la Banda, Maria
Abramson, David
Konagurthu, Arun S
collection PubMed
description MOTIVATION: Alignments are correspondences between sequences. How reliable are alignments of amino acid sequences of proteins, and what inferences about protein relationships can be drawn? Using techniques not previously applied to these questions, by weighting every possible sequence alignment by its posterior probability we derive a formal mathematical expectation, and develop an efficient algorithm for computation of the distance between alternative alignments allowing quantitative comparisons of sequence-based alignments with corresponding reference structure alignments. RESULTS: By analyzing the sequences and structures of 1 million protein domain pairs, we report the variation of the expected distance between sequence-based and structure-based alignments, as a function of (Markov time of) sequence divergence. Our results clearly demarcate the ‘daylight’, ‘twilight’ and ‘midnight’ zones for interpreting residue–residue correspondences from sequence information alone. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9235515
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9235515 2022-06-29 On the reliability and the limits of inference of amino acid sequence alignments Rajapaksa, Sandun Sumanaweera, Dinithi Lesk, Arthur M Allison, Lloyd Stuckey, Peter J Garcia de la Banda, Maria Abramson, David Konagurthu, Arun S Bioinformatics ISCB/Ismb 2022 MOTIVATION: Alignments are correspondences between sequences. How reliable are alignments of amino acid sequences of proteins, and what inferences about protein relationships can be drawn? Using techniques not previously applied to these questions, by weighting every possible sequence alignment by its posterior probability we derive a formal mathematical expectation, and develop an efficient algorithm for computation of the distance between alternative alignments allowing quantitative comparisons of sequence-based alignments with corresponding reference structure alignments. RESULTS: By analyzing the sequences and structures of 1 million protein domain pairs, we report the variation of the expected distance between sequence-based and structure-based alignments, as a function of (Markov time of) sequence divergence. Our results clearly demarcate the ‘daylight’, ‘twilight’ and ‘midnight’ zones for interpreting residue–residue correspondences from sequence information alone. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-06-27 /pmc/articles/PMC9235515/ /pubmed/35758808 http://dx.doi.org/10.1093/bioinformatics/btac247 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title On the reliability and the limits of inference of amino acid sequence alignments
topic ISCB/Ismb 2022