COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization

Protein contact prediction helps reconstruct the tertiary structure that greatly determines a protein’s function; therefore, contact prediction from the sequence is an important problem. Recently there has been exciting progress on this problem, but many of the existing methods are still low quality...

Bibliographic Details
Main Authors: Reza, Md. Selim, Zhang, Huiling, Hossain, Md. Tofazzal, Jin, Langxi, Feng, Shengzhong, Wei, Yanjie
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8305966/
https://www.ncbi.nlm.nih.gov/pubmed/34209399
http://dx.doi.org/10.3390/membranes11070503
_version_ 1783727697319428096
author Reza, Md. Selim
Zhang, Huiling
Hossain, Md. Tofazzal
Jin, Langxi
Feng, Shengzhong
Wei, Yanjie
author_facet Reza, Md. Selim
Zhang, Huiling
Hossain, Md. Tofazzal
Jin, Langxi
Feng, Shengzhong
Wei, Yanjie
author_sort Reza, Md. Selim
collection PubMed
description Protein contact prediction helps reconstruct the tertiary structure that largely determines a protein’s function; contact prediction from the sequence is therefore an important problem. Despite exciting recent progress on this problem, many existing methods still have low prediction accuracy. In this paper, we present a new mixed integer linear programming (MILP)-based consensus method: a Consensus scheme based On a Mixed integer linear opTimization method for prOtein contact Prediction (COMTOP). The MILP-based consensus method combines the strengths of seven selected protein contact prediction methods (CCMpred, EVfold, DeepCov, NNcon, PconsC4, plmDCA, and PSICOV) by optimizing the number of correctly predicted contacts, thereby achieving better prediction accuracy. The proposed hybrid protein residue–residue contact prediction scheme was tested on four independent test sets. For 239 highly non-redundant proteins, the method showed prediction accuracies of 59.68%, 70.79%, 78.86%, 89.04%, 94.51%, and 97.35% for top-5L, top-3L, top-2L, top-L, top-L/2, and top-L/5 contacts, respectively. When tested on the CASP13 and CASP14 test sets, the proposed method obtained accuracies of 75.91% and 77.49% for top-L/5 predictions, respectively. COMTOP was further tested on 57 non-redundant α-helical transmembrane proteins and achieved prediction accuracies of 64.34% and 73.91% for top-L/2 and top-L/5 predictions, respectively. For all test datasets, the improvement of COMTOP in accuracy over the seven individual methods increased with the number of predicted contacts. For example, the improvement was much larger for large numbers of predicted contacts (such as top-5L and top-3L) than for small numbers of predicted contacts (such as top-L/2 and top-L/5). The results and analysis demonstrate that COMTOP significantly improves on the performance of the individual methods and is more robust across different types of test sets. COMTOP also showed better or comparable predictions when compared with state-of-the-art predictors.
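The accuracies in the description are reported for the top-5L through top-L/5 highest-ranked contacts, where L is the sequence length. As an illustration only, the sketch below shows one common way such a top-L/k accuracy can be computed: rank residue pairs by predicted score, keep the top (L × num / den) pairs, and report the fraction that are true contacts. The function name `top_l_precision`, its parameters, and the toy scores are hypothetical and are not taken from the COMTOP paper; the record does not state the exact evaluation protocol (for example, any minimum sequence separation), so none is applied here.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # (i, j) residue indices with i < j


def top_l_precision(
    scores: Dict[Pair, float],
    true_contacts: Set[Pair],
    seq_len: int,
    num: int = 1,
    den: int = 1,
) -> float:
    """Fraction of true contacts among the top (seq_len * num / den) pairs.

    num/den selects the cutoff: (5, 1) is top-5L, (1, 5) is top-L/5.
    Assumption: the denominator is the number of pairs actually ranked,
    which may be smaller than the cutoff if few pairs were scored.
    """
    n_top = max(1, (seq_len * num) // den)
    # Sort pairs from highest to lowest predicted score and keep the top n_top.
    ranked = sorted(scores, key=lambda pair: scores[pair], reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy data (made up for illustration): a 10-residue protein, six scored pairs.
toy_scores = {
    (1, 8): 0.95,
    (2, 9): 0.90,
    (3, 7): 0.60,
    (1, 5): 0.40,
    (4, 10): 0.30,
    (2, 6): 0.10,
}
toy_true = {(1, 8), (3, 7), (4, 10)}

print(top_l_precision(toy_scores, toy_true, seq_len=10, num=1, den=5))  # top-L/5 -> 0.5
print(top_l_precision(toy_scores, toy_true, seq_len=10, num=1, den=2))  # top-L/2 -> 0.6
```

With the toy data, the top-L/5 cutoff keeps two pairs, one of which is a true contact, and the top-L/2 cutoff keeps five pairs, three of which are true contacts.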
format Online
Article
Text
id pubmed-8305966
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8305966 2021-07-25 COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization Reza, Md. Selim Zhang, Huiling Hossain, Md. Tofazzal Jin, Langxi Feng, Shengzhong Wei, Yanjie Membranes (Basel) Article Protein contact prediction helps reconstruct the tertiary structure that largely determines a protein’s function; contact prediction from the sequence is therefore an important problem. Despite exciting recent progress on this problem, many existing methods still have low prediction accuracy. In this paper, we present a new mixed integer linear programming (MILP)-based consensus method: a Consensus scheme based On a Mixed integer linear opTimization method for prOtein contact Prediction (COMTOP). The MILP-based consensus method combines the strengths of seven selected protein contact prediction methods (CCMpred, EVfold, DeepCov, NNcon, PconsC4, plmDCA, and PSICOV) by optimizing the number of correctly predicted contacts, thereby achieving better prediction accuracy. The proposed hybrid protein residue–residue contact prediction scheme was tested on four independent test sets. For 239 highly non-redundant proteins, the method showed prediction accuracies of 59.68%, 70.79%, 78.86%, 89.04%, 94.51%, and 97.35% for top-5L, top-3L, top-2L, top-L, top-L/2, and top-L/5 contacts, respectively. When tested on the CASP13 and CASP14 test sets, the proposed method obtained accuracies of 75.91% and 77.49% for top-L/5 predictions, respectively. COMTOP was further tested on 57 non-redundant α-helical transmembrane proteins and achieved prediction accuracies of 64.34% and 73.91% for top-L/2 and top-L/5 predictions, respectively. For all test datasets, the improvement of COMTOP in accuracy over the seven individual methods increased with the number of predicted contacts. For example, the improvement was much larger for large numbers of predicted contacts (such as top-5L and top-3L) than for small numbers of predicted contacts (such as top-L/2 and top-L/5). The results and analysis demonstrate that COMTOP significantly improves on the performance of the individual methods and is more robust across different types of test sets. COMTOP also showed better or comparable predictions when compared with state-of-the-art predictors. MDPI 2021-06-30 /pmc/articles/PMC8305966/ /pubmed/34209399 http://dx.doi.org/10.3390/membranes11070503 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Reza, Md. Selim
Zhang, Huiling
Hossain, Md. Tofazzal
Jin, Langxi
Feng, Shengzhong
Wei, Yanjie
COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization
title COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization
title_full COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization
title_fullStr COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization
title_full_unstemmed COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization
title_short COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization
title_sort comtop: protein residue–residue contact prediction through mixed integer linear optimization
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8305966/
https://www.ncbi.nlm.nih.gov/pubmed/34209399
http://dx.doi.org/10.3390/membranes11070503
work_keys_str_mv AT rezamdselim comtopproteinresidueresiduecontactpredictionthroughmixedintegerlinearoptimization
AT zhanghuiling comtopproteinresidueresiduecontactpredictionthroughmixedintegerlinearoptimization
AT hossainmdtofazzal comtopproteinresidueresiduecontactpredictionthroughmixedintegerlinearoptimization
AT jinlangxi comtopproteinresidueresiduecontactpredictionthroughmixedintegerlinearoptimization
AT fengshengzhong comtopproteinresidueresiduecontactpredictionthroughmixedintegerlinearoptimization
AT weiyanjie comtopproteinresidueresiduecontactpredictionthroughmixedintegerlinearoptimization