FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution

Predicting three-dimensional protein structure and assembling protein complexes using sequence information are among the most prominent tasks in computational biology. Recently, substantial progress has been made in the case of single proteins using a combination of unsupervised coevolutionary s...

Bibliographic Details
Main authors: Muscat, Maureen; Croce, Giancarlo; Sarti, Edoardo; Weigt, Martin
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7577475/
https://www.ncbi.nlm.nih.gov/pubmed/33035205
http://dx.doi.org/10.1371/journal.pcbi.1007621
_version_ 1783598197899264000
author Muscat, Maureen
Croce, Giancarlo
Sarti, Edoardo
Weigt, Martin
author_facet Muscat, Maureen
Croce, Giancarlo
Sarti, Edoardo
Weigt, Martin
author_sort Muscat, Maureen
collection PubMed
description Predicting three-dimensional protein structure and assembling protein complexes using sequence information are among the most prominent tasks in computational biology. Recently, substantial progress has been made in the case of single proteins using a combination of unsupervised coevolutionary sequence analysis with structurally supervised deep learning. While reaching impressive accuracies in predicting residue-residue contacts, deep learning has a number of disadvantages. The need for large structural training sets limits its applicability to multi-protein complexes, and the deep architecture of convolutional neural networks makes them intrinsically hard to interpret. Here we introduce FilterDCA, a simpler supervised predictor for inter-domain and inter-protein contacts. It is based on the fact that contact maps of proteins show typical contact patterns, which result from secondary structure and are reflected by patterns in coevolutionary analysis. We explicitly integrate averaged contact patterns with coevolutionary scores derived by Direct Coupling Analysis, improving performance over standard coevolutionary analysis while remaining fully transparent and interpretable. The FilterDCA code is available at http://gitlab.lcqb.upmc.fr/muscat/FilterDCA.
format Online
Article
Text
id pubmed-7577475
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7577475 2020-10-26 FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution Muscat, Maureen Croce, Giancarlo Sarti, Edoardo Weigt, Martin PLoS Comput Biol Research Article Predicting three-dimensional protein structure and assembling protein complexes using sequence information are among the most prominent tasks in computational biology. Recently, substantial progress has been made in the case of single proteins using a combination of unsupervised coevolutionary sequence analysis with structurally supervised deep learning. While reaching impressive accuracies in predicting residue-residue contacts, deep learning has a number of disadvantages. The need for large structural training sets limits its applicability to multi-protein complexes, and the deep architecture of convolutional neural networks makes them intrinsically hard to interpret. Here we introduce FilterDCA, a simpler supervised predictor for inter-domain and inter-protein contacts. It is based on the fact that contact maps of proteins show typical contact patterns, which result from secondary structure and are reflected by patterns in coevolutionary analysis. We explicitly integrate averaged contact patterns with coevolutionary scores derived by Direct Coupling Analysis, improving performance over standard coevolutionary analysis while remaining fully transparent and interpretable. The FilterDCA code is available at http://gitlab.lcqb.upmc.fr/muscat/FilterDCA. Public Library of Science 2020-10-09 /pmc/articles/PMC7577475/ /pubmed/33035205 http://dx.doi.org/10.1371/journal.pcbi.1007621 Text en © 2020 Muscat et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Muscat, Maureen
Croce, Giancarlo
Sarti, Edoardo
Weigt, Martin
FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution
title FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution
title_full FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution
title_fullStr FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution
title_full_unstemmed FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution
title_short FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution
title_sort filterdca: interpretable supervised contact prediction using inter-domain coevolution
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7577475/
https://www.ncbi.nlm.nih.gov/pubmed/33035205
http://dx.doi.org/10.1371/journal.pcbi.1007621
work_keys_str_mv AT muscatmaureen filterdcainterpretablesupervisedcontactpredictionusinginterdomaincoevolution
AT crocegiancarlo filterdcainterpretablesupervisedcontactpredictionusinginterdomaincoevolution
AT sartiedoardo filterdcainterpretablesupervisedcontactpredictionusinginterdomaincoevolution
AT weigtmartin filterdcainterpretablesupervisedcontactpredictionusinginterdomaincoevolution