MUfoldQA_G: High-accuracy protein model QA via retraining and transformation

Protein tertiary structure prediction is an active research area and has attracted significant attention recently due to the success of AlphaFold from DeepMind. Methods capable of accurately evaluating the quality of predicted models are of great importance. In the past, although many model quality...

Bibliographic Details
Main authors:	Wang, Wenbo; Wang, Junlin; Li, Zhaoyu; Xu, Dong; Shang, Yi
Format:	Online Article Text
Language:	English
Published:	Research Network of Computational and Structural Biotechnology 2021
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8636996/ https://www.ncbi.nlm.nih.gov/pubmed/34900138 http://dx.doi.org/10.1016/j.csbj.2021.11.021

_version_	1784608652217286656
author	Wang, Wenbo; Wang, Junlin; Li, Zhaoyu; Xu, Dong; Shang, Yi
collection	PubMed
description	Protein tertiary structure prediction is an active research area and has attracted significant attention recently due to the success of AlphaFold from DeepMind. Methods capable of accurately evaluating the quality of predicted models are of great importance. In the past, although many model quality assessment (QA) methods have been developed, their accuracies are not consistently high across different QA performance metrics for diverse target proteins. In this paper, we propose MUfoldQA_G, a new multi-model QA method that aims at simultaneously optimizing Pearson correlation and average GDT-TS difference, two commonly used QA performance metrics. This method is based on two new algorithms MUfoldQA_Gp and MUfoldQA_Gr. MUfoldQA_Gp uses a new technique to combine information from protein templates and reference protein models to maximize the Pearson correlation QA metric. MUfoldQA_Gr employs a new machine learning technique that resamples training data and retrains adaptively to learn a consensus model that is better than naïve consensus while minimizing average GDT-TS difference. MUfoldQA_G uses a new method to combine the results of MUfoldQA_Gr and MUfoldQA_Gp so that the final QA prediction results achieve low average GDT-TS difference that is close to the results from MUfoldQA_Gr, while maintaining high Pearson correlation that is the same as the results from MUfoldQA_Gp. In CASP14 QA categories, MUfoldQA_G ranked No. 1 in Pearson correlation and No. 2 in average GDT-TS difference.
format	Online Article Text
id	pubmed-8636996
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Research Network of Computational and Structural Biotechnology
record_format	MEDLINE/PubMed
spelling	pubmed-8636996 2021-12-09; Comput Struct Biotechnol J; Research Article; Research Network of Computational and Structural Biotechnology 2021-11-23; /pmc/articles/PMC8636996/ /pubmed/34900138 http://dx.doi.org/10.1016/j.csbj.2021.11.021; Text en; © 2021 The Author(s). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title	MUfoldQA_G: High-accuracy protein model QA via retraining and transformation
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8636996/ https://www.ncbi.nlm.nih.gov/pubmed/34900138 http://dx.doi.org/10.1016/j.csbj.2021.11.021
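
The description in this record names two QA performance metrics, Pearson correlation and average GDT-TS difference, without defining them. The following is a minimal sketch of how they might be computed for one target, assuming that "average GDT-TS difference" means the mean absolute difference between predicted and true GDT-TS scores over a target's models (the record does not state this). The function names and the sample scores are hypothetical and are not taken from MUfoldQA_G.

```python
from math import sqrt


def pearson_correlation(predicted, actual):
    """Pearson correlation between predicted and true per-model scores."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_p * var_a)


def average_gdt_ts_difference(predicted, actual):
    """Mean absolute difference between predicted and true GDT-TS scores.

    Assumption: this is the intended meaning of the metric named in the record.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty, equal-length sequences")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Made-up scores for four models of a single target (illustration only).
    predicted_scores = [0.62, 0.48, 0.71, 0.35]
    true_scores = [0.60, 0.50, 0.75, 0.30]
    print("Pearson correlation:", pearson_correlation(predicted_scores, true_scores))
    print("Average GDT-TS difference:", average_gdt_ts_difference(predicted_scores, true_scores))
```

Under these assumptions a good QA method scores high on the first metric and low on the second, which is why the description treats them as two separate objectives.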

Similar Items