
Evaluation of residue-residue contact prediction methods: From retrospective to prospective

Sequence-based residue contact prediction plays a crucial role in protein structure reconstruction. In recent years, the combination of evolutionary coupling analysis (ECA) and deep learning (DL) techniques has made tremendous progress for residue contact prediction, thus a comprehensive assessment...


Bibliographic Details
Main authors: Zhang, Huiling, Bei, Zhendong, Xi, Wenhui, Hao, Min, Ju, Zhen, Saravanan, Konda Mani, Zhang, Haiping, Guo, Ning, Wei, Yanjie
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8177648/
https://www.ncbi.nlm.nih.gov/pubmed/34029314
http://dx.doi.org/10.1371/journal.pcbi.1009027
_version_ 1783703426741305344
author Zhang, Huiling
Bei, Zhendong
Xi, Wenhui
Hao, Min
Ju, Zhen
Saravanan, Konda Mani
Zhang, Haiping
Guo, Ning
Wei, Yanjie
author_sort Zhang, Huiling
collection PubMed
description Sequence-based residue contact prediction plays a crucial role in protein structure reconstruction. In recent years, the combination of evolutionary coupling analysis (ECA) and deep learning (DL) techniques has brought tremendous progress in residue contact prediction, so a comprehensive assessment of current methods on a large-scale benchmark data set is much needed. In this study, we evaluate 18 contact predictors on 610 non-redundant proteins and 32 CASP13 targets from a wide range of perspectives. The results show that different methods suit different application scenarios: (1) DL methods based on multiple categories of inputs and large training sets are the best choices for low-contact-density proteins, such as intrinsically disordered ones, and for proteins with shallow multiple sequence alignments (MSAs). (2) With at least 5L (L is sequence length) effective sequences in the MSA, all the methods show their best performance, and methods that rely only on the MSA as input can perform comparably to methods that adopt multi-source inputs. (3) For top L/5 and L/2 predictions, DL methods predict more hydrophobic interactions, while ECA methods predict more salt bridges and disulfide bonds. (4) ECA methods detect more secondary structure interactions, while DL methods accurately uncover more contact patterns and prune isolated false positives. In general, multi-input DL methods with large training sets dominate current approaches with the best overall performance. Despite the great success of current DL methods, there is still much room for improvement: (1) With shallow MSAs, performance is greatly affected. (2) Current methods show lower precision for inter-domain than for intra-domain contact predictions, as well as very large imbalances in precision among intra-domain predictions. (3) Strong prediction similarities between DL methods indicate that more feature types and more diversified models need to be developed. (4) The runtime of most methods can be further optimized.
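The "top L/5 and L/2 predictions" mentioned in the description refer to the highest-scoring L/5 or L/2 residue pairs for a protein of length L. As a minimal sketch of how such a precision could be computed, the Python function below ranks candidate pairs by predicted score and reports the fraction that are true contacts. The function name, the NumPy-based matrix layout, and the minimum sequence separation of 6 residues are assumptions made for illustration; the record does not state the exact protocol used in the article.

```python
import numpy as np

def top_l_over_k_precision(scores, contacts, k=5, min_sep=6):
    """Precision of the top L/k scored residue pairs.

    scores   : (L, L) array of predicted contact scores (hypothetical layout)
    contacts : (L, L) boolean array of true contacts
    k        : 5 for top L/5, 2 for top L/2
    min_sep  : minimum sequence separation j - i (assumed value, not from the record)
    """
    scores = np.asarray(scores, dtype=float)
    contacts = np.asarray(contacts, dtype=bool)
    L = scores.shape[0]
    # Upper-triangle pairs (i, j) with j - i >= min_sep
    i, j = np.triu_indices(L, k=min_sep)
    n_top = max(1, L // k)
    # Indices of the n_top highest-scoring candidate pairs
    order = np.argsort(scores[i, j])[::-1][:n_top]
    return float(contacts[i[order], j[order]].mean())

# Toy example with L = 10, so top L/5 means the 2 best-scoring pairs
L = 10
scores = np.zeros((L, L))
contacts = np.zeros((L, L), dtype=bool)
scores[0, 9], scores[1, 8], scores[2, 9] = 0.9, 0.8, 0.7
contacts[0, 9] = contacts[2, 9] = True
print(top_l_over_k_precision(scores, contacts, k=5))  # 0.5
```

In the toy example, the two top-ranked pairs are (0, 9), a true contact, and (1, 8), a false positive, which gives a top L/5 precision of 0.5.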
format Online
Article
Text
id pubmed-8177648
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8177648 2021-06-07 Evaluation of residue-residue contact prediction methods: From retrospective to prospective Zhang, Huiling Bei, Zhendong Xi, Wenhui Hao, Min Ju, Zhen Saravanan, Konda Mani Zhang, Haiping Guo, Ning Wei, Yanjie PLoS Comput Biol Research Article Sequence-based residue contact prediction plays a crucial role in protein structure reconstruction. In recent years, the combination of evolutionary coupling analysis (ECA) and deep learning (DL) techniques has brought tremendous progress in residue contact prediction, so a comprehensive assessment of current methods on a large-scale benchmark data set is much needed. In this study, we evaluate 18 contact predictors on 610 non-redundant proteins and 32 CASP13 targets from a wide range of perspectives. The results show that different methods suit different application scenarios: (1) DL methods based on multiple categories of inputs and large training sets are the best choices for low-contact-density proteins, such as intrinsically disordered ones, and for proteins with shallow multiple sequence alignments (MSAs). (2) With at least 5L (L is sequence length) effective sequences in the MSA, all the methods show their best performance, and methods that rely only on the MSA as input can perform comparably to methods that adopt multi-source inputs. (3) For top L/5 and L/2 predictions, DL methods predict more hydrophobic interactions, while ECA methods predict more salt bridges and disulfide bonds. (4) ECA methods detect more secondary structure interactions, while DL methods accurately uncover more contact patterns and prune isolated false positives. In general, multi-input DL methods with large training sets dominate current approaches with the best overall performance. Despite the great success of current DL methods, there is still much room for improvement: (1) With shallow MSAs, performance is greatly affected. (2) Current methods show lower precision for inter-domain than for intra-domain contact predictions, as well as very large imbalances in precision among intra-domain predictions. (3) Strong prediction similarities between DL methods indicate that more feature types and more diversified models need to be developed. (4) The runtime of most methods can be further optimized. Public Library of Science 2021-05-24 /pmc/articles/PMC8177648/ /pubmed/34029314 http://dx.doi.org/10.1371/journal.pcbi.1009027 Text en © 2021 Zhang et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evaluation of residue-residue contact prediction methods: From retrospective to prospective
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8177648/
https://www.ncbi.nlm.nih.gov/pubmed/34029314
http://dx.doi.org/10.1371/journal.pcbi.1009027