Reciprocal best structure hits: using AlphaFold models to discover distant homologues

MOTIVATION: The conventional methods to detect homologous protein pairs use the comparison of protein sequences. But the sequences of two homologous proteins may diverge significantly and consequently may be undetectable by standard approaches. The release of the AlphaFold 2.0 software enables the p...

Bibliographic Details
Main authors: Monzon, Vivian; Paysan-Lafosse, Typhaine; Wood, Valerie; Bateman, Alex
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9666668/
https://www.ncbi.nlm.nih.gov/pubmed/36408459
http://dx.doi.org/10.1093/bioadv/vbac072
author Monzon, Vivian
Paysan-Lafosse, Typhaine
Wood, Valerie
Bateman, Alex
collection PubMed
description MOTIVATION: The conventional methods to detect homologous protein pairs use the comparison of protein sequences. But the sequences of two homologous proteins may diverge significantly and consequently may be undetectable by standard approaches. The release of the AlphaFold 2.0 software enables the prediction of highly accurate protein structures and opens many opportunities to advance our understanding of protein functions, including the detection of homologous protein structure pairs. RESULTS: In this proof-of-concept work, we search for the closest homologous protein pairs using the structure models of five model organisms from the AlphaFold database. We compare the results with homologous protein pairs detected by their sequence similarity and show that the structural matching approach finds a similar set of results. In addition, we detect potential novel homologs solely with the structural matching approach, which can help to understand the function of uncharacterized proteins and make previously overlooked connections between well-characterized proteins. We also observe limitations of our implementation of the structure-based approach, particularly when handling highly disordered proteins or short protein structures. Our work shows that high-accuracy protein structure models can be used to discover homologous protein pairs, and we expose areas for improvement of this structural matching approach. AVAILABILITY AND IMPLEMENTATION: Information on the discovered homologous protein pairs can be found at the following URL: https://doi.org/10.17863/CAM.87873. The code can be accessed here: https://github.com/VivianMonzon/Reciprocal_Best_Structure_Hits. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-9666668
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9666668 2022-11-16 Reciprocal best structure hits: using AlphaFold models to discover distant homologues Monzon, Vivian Paysan-Lafosse, Typhaine Wood, Valerie Bateman, Alex Bioinform Adv Original Paper
Oxford University Press 2022-10-06 /pmc/articles/PMC9666668/ /pubmed/36408459 http://dx.doi.org/10.1093/bioadv/vbac072 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Reciprocal best structure hits: using AlphaFold models to discover distant homologues
topic Original Paper
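
The title refers to reciprocal best hits, and the description speaks of searching for the closest homologous protein pairs between organisms. This record does not say which structure search tool or similarity score the authors used, so the following is only a minimal sketch of the general reciprocal-best-hit idea, not the authors' implementation. The function names, the protein identifiers, and the scores are all hypothetical, and higher scores are assumed to mean greater structural similarity.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: query ID -> {target ID -> similarity score}.
Hits = Dict[str, Dict[str, float]]


def best_hits(hits: Hits) -> Dict[str, str]:
    """Map each query to its single highest-scoring target.

    Queries with no targets are skipped. On ties, the first target
    encountered with the maximum score is kept.
    """
    best: Dict[str, str] = {}
    for query, targets in hits.items():
        if targets:
            best[query] = max(targets, key=targets.get)
    return best


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Made-up scores for two toy proteomes, A and B.
    a_vs_b = {
        "A1": {"B1": 0.9, "B2": 0.4},
        "A2": {"B2": 0.7, "B1": 0.6},
        "A3": {"B1": 0.8},
    }
    b_vs_a = {
        "B1": {"A1": 0.95, "A3": 0.7},
        "B2": {"A2": 0.65, "A1": 0.3},
    }
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy data, A3's best hit is B1, but B1's best hit is A1, so A3 is not reported. Requiring agreement in both search directions is what makes the pairing stricter than a one-way best hit.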