
A Benchmark Dataset for Evaluating Practical Performance of Model Quality Assessment of Homology Models

Protein structure prediction is an important issue in structural bioinformatics. In this process, model quality assessment (MQA), which estimates the accuracy of the predicted structure, is also practically important. Currently, the most commonly used dataset to evaluate the performance of MQA is th...


Bibliographic Details
Main authors: Takei, Yuma; Ishida, Takashi
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8945737/
https://www.ncbi.nlm.nih.gov/pubmed/35324806
http://dx.doi.org/10.3390/bioengineering9030118
author Takei, Yuma
Ishida, Takashi
author_facet Takei, Yuma
Ishida, Takashi
author_sort Takei, Yuma
collection PubMed
description Protein structure prediction is an important issue in structural bioinformatics. In this process, model quality assessment (MQA), which estimates the accuracy of the predicted structure, is also practically important. Currently, the most commonly used dataset to evaluate the performance of MQA is the Critical Assessment of protein Structure Prediction (CASP) dataset. However, the CASP dataset does not contain enough targets with high-quality models, and thus cannot sufficiently evaluate MQA performance in practical use. Additionally, most application studies employ homology modeling because of its reliability. However, the CASP dataset includes models generated by de novo methods, which may lead to the misestimation of MQA performance. In this study, we created new benchmark datasets, named a homology models dataset for model quality assessment (HMDM), that contain targets with high-quality models derived using homology modeling. We then benchmarked the performance of the MQA methods using the new datasets and compared their performance to that of the classical selection based on the sequence identity of the template proteins. The results showed that model selection by the latest MQA methods using deep learning is better than selection by template sequence identity and classical statistical potentials. Using HMDM, it is possible to verify the MQA performance for high-accuracy homology models.
format Online
Article
Text
id pubmed-8945737
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8945737 2022-03-25 A Benchmark Dataset for Evaluating Practical Performance of Model Quality Assessment of Homology Models Takei, Yuma Ishida, Takashi Bioengineering (Basel) Article MDPI 2022-03-15 /pmc/articles/PMC8945737/ /pubmed/35324806 http://dx.doi.org/10.3390/bioengineering9030118 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Benchmark Dataset for Evaluating Practical Performance of Model Quality Assessment of Homology Models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8945737/
https://www.ncbi.nlm.nih.gov/pubmed/35324806
http://dx.doi.org/10.3390/bioengineering9030118