Analysis of several key factors influencing deep learning-based inter-residue contact prediction

MOTIVATION: Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated. RESULTS: We analyzed the results of our three deep learning-based contact pr...

Bibliographic Details
Main Authors:	Wu, Tianqi; Hou, Jie; Adhikari, Badri; Cheng, Jianlin
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703788/ https://www.ncbi.nlm.nih.gov/pubmed/31504181 http://dx.doi.org/10.1093/bioinformatics/btz679

_version_	1783616696671535104
author	Wu, Tianqi; Hou, Jie; Adhikari, Badri; Cheng, Jianlin
author_facet	Wu, Tianqi; Hou, Jie; Adhikari, Badri; Cheng, Jianlin
author_sort	Wu, Tianqi
collection	PubMed
description	MOTIVATION: Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated. RESULTS: We analyzed the results of our three deep learning-based contact prediction methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT and MULTICOM-NOVEL) in the CASP13 experiment and identified several key factors [i.e. deep learning technique, multiple sequence alignment (MSA), distance distribution prediction and domain-based contact integration] that influenced the contact prediction accuracy. We compared our convolutional neural network (CNN)-based contact prediction methods with three coevolution-based methods on 75 CASP13 targets consisting of 108 domains. We demonstrated that the CNN-based multi-distance approach was able to leverage global coevolutionary coupling patterns comprised of multiple correlated contacts for more accurate contact prediction than the local coevolution-based methods, leading to a substantial increase of precision by 19.2 percentage points. We also tested different alignment methods and domain-based contact prediction with the deep learning contact predictors. The comparison of the three methods showed deeper sequence alignments and the integration of domain-based contact prediction with the full-length contact prediction improved the performance of contact prediction. Moreover, we demonstrated that the domain-based contact prediction based on a novel ab initio approach of parsing domains from MSAs alone without using known protein structures was a simple, fast approach to improve contact prediction. Finally, we showed that predicting the distribution of inter-residue distances in multiple distance intervals could capture more structural information and improve binary contact prediction. AVAILABILITY AND IMPLEMENTATION: https://github.com/multicom-toolbox/DNCON2/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-7703788
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-7703788 2020-12-07 Analysis of several key factors influencing deep learning-based inter-residue contact prediction Wu, Tianqi Hou, Jie Adhikari, Badri Cheng, Jianlin Bioinformatics Original Papers MOTIVATION: Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated. RESULTS: We analyzed the results of our three deep learning-based contact prediction methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT and MULTICOM-NOVEL) in the CASP13 experiment and identified several key factors [i.e. deep learning technique, multiple sequence alignment (MSA), distance distribution prediction and domain-based contact integration] that influenced the contact prediction accuracy. We compared our convolutional neural network (CNN)-based contact prediction methods with three coevolution-based methods on 75 CASP13 targets consisting of 108 domains. We demonstrated that the CNN-based multi-distance approach was able to leverage global coevolutionary coupling patterns comprised of multiple correlated contacts for more accurate contact prediction than the local coevolution-based methods, leading to a substantial increase of precision by 19.2 percentage points. We also tested different alignment methods and domain-based contact prediction with the deep learning contact predictors. The comparison of the three methods showed deeper sequence alignments and the integration of domain-based contact prediction with the full-length contact prediction improved the performance of contact prediction. Moreover, we demonstrated that the domain-based contact prediction based on a novel ab initio approach of parsing domains from MSAs alone without using known protein structures was a simple, fast approach to improve contact prediction. Finally, we showed that predicting the distribution of inter-residue distances in multiple distance intervals could capture more structural information and improve binary contact prediction. AVAILABILITY AND IMPLEMENTATION: https://github.com/multicom-toolbox/DNCON2/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2019-08-30 /pmc/articles/PMC7703788/ /pubmed/31504181 http://dx.doi.org/10.1093/bioinformatics/btz679 Text en © The Author(s) 2019. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Original Papers Wu, Tianqi Hou, Jie Adhikari, Badri Cheng, Jianlin Analysis of several key factors influencing deep learning-based inter-residue contact prediction
title	Analysis of several key factors influencing deep learning-based inter-residue contact prediction
title_full	Analysis of several key factors influencing deep learning-based inter-residue contact prediction
title_fullStr	Analysis of several key factors influencing deep learning-based inter-residue contact prediction
title_full_unstemmed	Analysis of several key factors influencing deep learning-based inter-residue contact prediction
title_short	Analysis of several key factors influencing deep learning-based inter-residue contact prediction
title_sort	analysis of several key factors influencing deep learning-based inter-residue contact prediction
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703788/ https://www.ncbi.nlm.nih.gov/pubmed/31504181 http://dx.doi.org/10.1093/bioinformatics/btz679
work_keys_str_mv	AT wutianqi analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction AT houjie analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction AT adhikaribadri analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction AT chengjianlin analysisofseveralkeyfactorsinfluencingdeeplearningbasedinterresiduecontactprediction
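
The abstract in this record states that predicting a distribution of inter-residue distances over multiple distance intervals can improve binary contact prediction. The Python sketch below illustrates one plausible way such a distribution could be collapsed into a single contact probability: summing the probability mass of the bins that lie within a contact threshold. It is not taken from the paper or from the DNCON2 code base. The function name, the bin edges and the example probabilities are hypothetical, and the 8 Å default reflects the contact definition commonly used in CASP assessments.

```python
from typing import Sequence


def contact_probability(
    upper_edges: Sequence[float],
    probs: Sequence[float],
    threshold: float = 8.0,
) -> float:
    """Collapse a binned distance distribution into one contact probability.

    upper_edges[i] is the upper bound (in angstroms) of distance bin i and
    probs[i] is the predicted probability that the residue pair falls in it.
    The result is the total mass of bins lying entirely within the threshold.
    """
    if len(upper_edges) != len(probs):
        raise ValueError("upper_edges and probs must have the same length")
    return sum(p for edge, p in zip(upper_edges, probs) if edge <= threshold)


# Hypothetical bins for one residue pair: 0-6, 6-8, 8-10 and >10 angstroms.
example_edges = [6.0, 8.0, 10.0, float("inf")]
example_probs = [0.125, 0.375, 0.25, 0.25]

# Mass of the two bins at or below 8 angstroms.
p_contact = contact_probability(example_edges, example_probs)
print(f"contact probability: {p_contact:.3f}")
```

Under these assumptions, a pair whose predicted mass sits mostly in the short-distance bins receives a high contact probability, while the finer bins retain distance information that a single binary label would discard.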
