Analysis of several key factors influencing deep learning-based inter-residue contact prediction
MOTIVATION: Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated. RESULTS: We analyzed the results of our three deep learning-based contact pr...
Main Authors: | Wu, Tianqi; Hou, Jie; Adhikari, Badri; Cheng, Jianlin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Original Papers |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703788/ https://www.ncbi.nlm.nih.gov/pubmed/31504181 http://dx.doi.org/10.1093/bioinformatics/btz679 |
_version_ | 1783616696671535104 |
---|---|
author | Wu, Tianqi Hou, Jie Adhikari, Badri Cheng, Jianlin |
author_sort | Wu, Tianqi |
collection | PubMed |
description | MOTIVATION: Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated. RESULTS: We analyzed the results of our three deep learning-based contact prediction methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT and MULTICOM-NOVEL) in the CASP13 experiment and identified several key factors [i.e. deep learning technique, multiple sequence alignment (MSA), distance distribution prediction and domain-based contact integration] that influenced the contact prediction accuracy. We compared our convolutional neural network (CNN)-based contact prediction methods with three coevolution-based methods on 75 CASP13 targets consisting of 108 domains. We demonstrated that the CNN-based multi-distance approach was able to leverage global coevolutionary coupling patterns composed of multiple correlated contacts for more accurate contact prediction than the local coevolution-based methods, leading to a substantial increase in precision of 19.2 percentage points. We also tested different alignment methods and domain-based contact prediction with the deep learning contact predictors. The comparison of the three methods showed that deeper sequence alignments and the integration of domain-based contact prediction with the full-length contact prediction improved the performance of contact prediction. Moreover, we demonstrated that domain-based contact prediction based on a novel ab initio approach of parsing domains from MSAs alone, without using known protein structures, was a simple, fast approach to improve contact prediction. Finally, we showed that predicting the distribution of inter-residue distances in multiple distance intervals could capture more structural information and improve binary contact prediction. AVAILABILITY AND IMPLEMENTATION: https://github.com/multicom-toolbox/DNCON2/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-7703788 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7703788 2020-12-07 Analysis of several key factors influencing deep learning-based inter-residue contact prediction Wu, Tianqi Hou, Jie Adhikari, Badri Cheng, Jianlin Bioinformatics Original Papers Oxford University Press 2019-08-30 /pmc/articles/PMC7703788/ /pubmed/31504181 http://dx.doi.org/10.1093/bioinformatics/btz679 Text en © The Author(s) 2019. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Analysis of several key factors influencing deep learning-based inter-residue contact prediction |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703788/ https://www.ncbi.nlm.nih.gov/pubmed/31504181 http://dx.doi.org/10.1093/bioinformatics/btz679 |
work_keys_str_mv | AT wutianqi analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction AT houjie analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction AT adhikaribadri analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction AT chengjianlin analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction |
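
The abstract reports that predicting a distribution over several distance intervals improves binary contact prediction, and it measures the gain as precision. The sketch below illustrates both ideas in plain Python; it is not the authors' implementation (that is in the DNCON2 repository linked in the record). The 8 Å contact threshold, the bin edges, the top-L/5 cutoff and the minimum sequence separation are common conventions assumed here, not values stated in this record, and the function names are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple

Pair = Tuple[int, int]  # residue index pair (i, j) with i < j


def contact_probability(
    bin_probs: Sequence[float],
    bin_upper_edges: Sequence[float],
    threshold: float = 8.0,  # assumed contact cutoff in angstroms
) -> float:
    """Collapse a distance distribution into one binary contact probability.

    Adds up the probability of every distance bin whose upper edge is at or
    below the threshold, i.e. every bin lying entirely inside contact range.
    """
    return sum(p for p, upper in zip(bin_probs, bin_upper_edges) if upper <= threshold)


def top_k_precision(
    predicted: Dict[Pair, float],
    true_contacts: Set[Pair],
    seq_len: int,
    min_separation: int = 24,  # assumed "long-range" separation
    divisor: int = 5,          # assumed top-L/5 evaluation
) -> float:
    """Precision of the top L/divisor predictions among sufficiently separated pairs."""
    candidates = [
        (prob, pair)
        for pair, prob in predicted.items()
        if pair[1] - pair[0] >= min_separation
    ]
    # Highest probability first; ties are broken by pair index for determinism.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    k = max(1, seq_len // divisor)
    top = candidates[:k]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in true_contacts)
    return hits / len(top)


if __name__ == "__main__":
    # Hypothetical bins: 0-4, 4-6, 6-8, 8-10, 10-12 and >12 angstroms.
    edges = [4.0, 6.0, 8.0, 10.0, 12.0, float("inf")]
    probs = [0.05, 0.25, 0.30, 0.20, 0.10, 0.10]
    print("contact probability:", contact_probability(probs, edges))

    # Toy prediction for a 10-residue chain with a relaxed separation of 3.
    predicted = {(0, 5): 0.9, (1, 7): 0.8, (2, 9): 0.4, (3, 4): 0.95}
    truth = {(0, 5), (2, 9)}
    print("top-L/5 precision:", top_k_precision(predicted, truth, 10, min_separation=3))
```

Summing bin probabilities keeps the finer-grained distance output available for other uses while still yielding a score comparable with binary contact predictors. A difference between two such precision values, expressed in percentage points, is the kind of figure the abstract quotes.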