Highly significant improvement of protein sequence alignments with AlphaFold2

MOTIVATION: Protein sequence alignments are essential to structural, evolutionary and functional analysis, but their accuracy is often limited by sequence similarity unless molecular structures are available. Protein structures predicted at experimental grade accuracy, as achieved by AlphaFold2, cou...

Bibliographic Details
Main Authors: Baltzis, Athanasios; Mansouri, Leila; Jin, Suzanne; Langer, Björn E; Erb, Ionas; Notredame, Cedric
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Original Papers
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9665868/
https://www.ncbi.nlm.nih.gov/pubmed/36130276
http://dx.doi.org/10.1093/bioinformatics/btac625
author Baltzis, Athanasios
Mansouri, Leila
Jin, Suzanne
Langer, Björn E
Erb, Ionas
Notredame, Cedric
collection PubMed
description MOTIVATION: Protein sequence alignments are essential to structural, evolutionary and functional analysis, but their accuracy is often limited by sequence similarity unless molecular structures are available. Protein structures predicted at experimental grade accuracy, as achieved by AlphaFold2, could therefore have a major impact on sequence analysis. RESULTS: Here, we find that multiple sequence alignments estimated on AlphaFold2 predictions are almost as accurate as alignments estimated on experimental structures and significantly closer to the structural reference than sequence-based alignments. We also show that AlphaFold2 structural models of relatively low quality can be used to obtain highly accurate alignments. These results suggest that, besides structure modeling, AlphaFold2 encodes higher-order dependencies that can be exploited for sequence analysis. AVAILABILITY AND IMPLEMENTATION: All data, analyses and results are available on Zenodo (https://doi.org/10.5281/zenodo.7031286). The code and scripts have been deposited in GitHub (https://github.com/cbcrg/msa-af2-nf) and the various containers in (https://cloud.sylabs.io/library/athbaltzis/af2/alphafold, https://hub.docker.com/r/athbaltzis/pred). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9665868
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9665868 2022-11-16. Bioinformatics, Original Papers. Oxford University Press, 2022-09-21. Text, en. © The Author(s) 2022. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
title Highly significant improvement of protein sequence alignments with AlphaFold2
topic Original Papers