On the use of direct-coupling analysis with a reduced alphabet of amino acids combined with super-secondary structure motifs for protein fold prediction

Direct-coupling analysis (DCA) for studying the coevolution of residues in proteins has been widely used to predict the three-dimensional structure of a protein from its sequence. We present RADI/raDIMod, a variation of the original DCA algorithm that groups chemically equivalent residues combined w...

Bibliographic Details
Main authors: Anton, Bernat; Besalú, Mireia; Fornes, Oriol; Bonet, Jaume; Molina, Alexis; Molina-Fernandez, Ruben; De las Cuevas, Gemma; Fernandez-Fuentes, Narcis; Oliva, Baldo
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8061457/
https://www.ncbi.nlm.nih.gov/pubmed/33937764
http://dx.doi.org/10.1093/nargab/lqab027
author Anton, Bernat
Besalú, Mireia
Fornes, Oriol
Bonet, Jaume
Molina, Alexis
Molina-Fernandez, Ruben
De las Cuevas, Gemma
Fernandez-Fuentes, Narcis
Oliva, Baldo
author_sort Anton, Bernat
collection PubMed
description Direct-coupling analysis (DCA) for studying the coevolution of residues in proteins has been widely used to predict the three-dimensional structure of a protein from its sequence. We present RADI/raDIMod, a variation of the original DCA algorithm that groups chemically equivalent residues combined with super-secondary structure motifs to model protein structures. Interestingly, the simplification produced by grouping amino acids into only two groups (polar and non-polar) is still representative of the physicochemical nature that characterizes the protein structure, and it is in line with the role of hydrophobic forces in protein-folding funneling. Because the alphabet is compressed, fewer sequences are required for the multiple sequence alignment. The number of long-range contacts predicted is limited; therefore, our approach requires the use of neighboring sequence positions. We use predicted secondary structure and super-secondary structure motifs to predict local contacts. Using RADI together with raDIMod, a fragment-based protein structure modelling method, we obtain near-native conformations when super-secondary motifs cover more than 30–50% of the sequence. Interestingly, although different contacts are predicted with different alphabets, they produce similar structures.
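The alphabet reduction described in the abstract can be illustrated with a minimal sketch. It is not the RADI implementation: the polar/non-polar split, the letters 'H' and 'P', and the helper names `reduce_sequence`, `reduce_alignment` and `pair_parameters` are assumptions made for illustration only. The parameter count uses the usual Potts-model estimate of q*q coupling values per pair of alignment columns and ignores gauge redundancy.

```python
# Hypothetical two-class grouping; the exact classes used by RADI are not
# given in this record, so this split is an assumption.
NON_POLAR = set("AVLIMFWPGC")
POLAR = set("STYNQDEKRH")


def reduce_sequence(seq, gap="-"):
    """Map residues to 'H' (non-polar) or 'P' (polar); anything else becomes a gap."""
    out = []
    for aa in seq.upper():
        if aa in NON_POLAR:
            out.append("H")
        elif aa in POLAR:
            out.append("P")
        else:
            out.append(gap)
    return "".join(out)


def reduce_alignment(msa):
    """Apply the reduced alphabet to every row of a multiple sequence alignment."""
    return [reduce_sequence(row) for row in msa]


def pair_parameters(length, q):
    """Rough count of pairwise coupling values for `length` columns and q states."""
    return length * (length - 1) // 2 * q * q


# Toy alignment (made-up sequences, not data from the paper).
msa = ["MKVLA-", "MRVIA-", "MKILAG"]
reduced = reduce_alignment(msa)
print(reduced)

# 20 amino acids + gap versus 2 classes + gap, for 100 alignment columns.
print(pair_parameters(100, 21), pair_parameters(100, 3))
```

With far fewer states per column, the model has far fewer couplings to estimate, which is consistent with the abstract's statement that a compressed alphabet lowers the number of aligned sequences required.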
format Online
Article
Text
id pubmed-8061457
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8061457 2021-04-29 NAR Genom Bioinform Methart Oxford University Press 2021-04-22 /pmc/articles/PMC8061457/ /pubmed/33937764 http://dx.doi.org/10.1093/nargab/lqab027 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title On the use of direct-coupling analysis with a reduced alphabet of amino acids combined with super-secondary structure motifs for protein fold prediction
topic Methart
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8061457/
https://www.ncbi.nlm.nih.gov/pubmed/33937764
http://dx.doi.org/10.1093/nargab/lqab027