PClass: Protein Quaternary Structure Classification by Using Bootstrapping Strategy as Model Selection

Bibliographic Details
Main authors: Huang, Chi-Chou; Chang, Chi-Chang; Chen, Chi-Wei; Ho, Shao-yu; Chang, Hsung-Pin; Chu, Yen-Wei
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5852587/
https://www.ncbi.nlm.nih.gov/pubmed/29443925
http://dx.doi.org/10.3390/genes9020091
_version_ 1783306599813611520
author Huang, Chi-Chou
Chang, Chi-Chang
Chen, Chi-Wei
Ho, Shao-yu
Chang, Hsung-Pin
Chu, Yen-Wei
author_sort Huang, Chi-Chou
collection PubMed
description A protein quaternary structure complex, also known as a multimer, plays an important role in the cell. For example, the dimer structure of transcription factors is involved in gene regulation, whereas the trimer structure of virus-infection-associated glycoproteins is related to the human immunodeficiency virus. Classifying protein quaternary structure complexes would therefore be of great help to proteomics research in the post-genome era, yet classification systems for protein quaternary structures have not been widely developed. In this study, we designed a two-layer machine learning architecture and developed the classification system PClass. Protein quaternary structure complexes are divided into five categories: monomer, dimer, trimer, tetramer, and other subunit classes. Within the framework of the bootstrap method with a support vector machine, we propose a new model selection method. In the first layer, each type of complex is classified based on sequences, entropy, and accessible surface area, which generates multiple feature modules, and the best-performing model is selected as the feature module for each type of complex. At this stage, the best performance reaches a Matthews correlation coefficient (MCC) of 70%. The second layer integrates the first-layer modules and uses six machine learning methods to improve prediction performance, raising the MCC by more than 10%. Finally, we analyzed the performance of the classification system using transcription factors with dimer structure and virus-infection-associated glycoproteins with trimer structure. PClass is available via a web interface at http://predictor.nchu.edu.tw/PClass/.
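The description above names two ingredients, the Matthews correlation coefficient and bootstrap-based model selection among feature modules, without giving details. The following is a minimal Python sketch of that general idea, not the authors' implementation. It assumes a binary one-vs-rest task (for example dimer versus non-dimer), scores each candidate module on out-of-bag samples, and swaps the support vector machine for a toy threshold classifier so that the example needs only the standard library. All function names and the sample data are hypothetical.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: MCC is taken as 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def fit_threshold(xs, ys):
    """Toy stand-in for an SVM (assumption): cut halfway between the class means."""
    pos = [x for x, t in zip(xs, ys) if t == 1]
    neg = [x for x, t in zip(xs, ys) if t == 0]
    if not pos or not neg:
        return None  # cannot train on a single-class sample
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    cut = (mean_pos + mean_neg) / 2
    sign = 1 if mean_pos >= mean_neg else -1
    return lambda x: 1 if sign * (x - cut) > 0 else 0


def bootstrap_score(xs, ys, rounds=100, seed=0):
    """Mean out-of-bag MCC over bootstrap resamples of one feature module."""
    rng = random.Random(seed)
    n = len(ys)
    scores = []
    for _ in range(rounds):
        # Draw n indices with replacement; unseen indices form the test set.
        in_bag = [rng.randrange(n) for _ in range(n)]
        chosen = set(in_bag)
        out_of_bag = [i for i in range(n) if i not in chosen]
        model = fit_threshold([xs[i] for i in in_bag], [ys[i] for i in in_bag])
        if model is None or not out_of_bag:
            continue  # skip rounds that cannot be trained or evaluated
        preds = [model(xs[i]) for i in out_of_bag]
        scores.append(mcc([ys[i] for i in out_of_bag], preds))
    return sum(scores) / len(scores) if scores else 0.0


def select_module(modules, ys, rounds=100, seed=0):
    """Return the module name with the highest mean bootstrap MCC, plus all scores."""
    scores = {name: bootstrap_score(xs, ys, rounds, seed) for name, xs in modules.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `select_module({"entropy": [...], "asa": [...]}, labels)` with one list of feature values per hypothetical module returns the winning module name and a dictionary of mean MCC scores. A real system would replace `fit_threshold` with an actual classifier trained on multi-dimensional features.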
format Online
Article
Text
id pubmed-5852587
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-5852587 2018-03-19 PClass: Protein Quaternary Structure Classification by Using Bootstrapping Strategy as Model Selection Genes (Basel) Article MDPI 2018-02-14 /pmc/articles/PMC5852587/ /pubmed/29443925 http://dx.doi.org/10.3390/genes9020091 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title PClass: Protein Quaternary Structure Classification by Using Bootstrapping Strategy as Model Selection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5852587/
https://www.ncbi.nlm.nih.gov/pubmed/29443925
http://dx.doi.org/10.3390/genes9020091