Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences
Many combinations of protein features are used to improve protein structural class prediction, but the information redundancy is often ignored. In order to select the important features with strong classification ability, we proposed a recursive feature selection with random forest to improve protei...
Main authors: | Wang, Yaoxin; Xu, Yingjie; Yang, Zhenyu; Liu, Xiaoqing; Dai, Qi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8123985/ https://www.ncbi.nlm.nih.gov/pubmed/34055035 http://dx.doi.org/10.1155/2021/5529389 |
_version_ | 1783693077650604032 |
---|---|
author | Wang, Yaoxin; Xu, Yingjie; Yang, Zhenyu; Liu, Xiaoqing; Dai, Qi |
author_facet | Wang, Yaoxin; Xu, Yingjie; Yang, Zhenyu; Liu, Xiaoqing; Dai, Qi |
author_sort | Wang, Yaoxin |
collection | PubMed |
description | Many combinations of protein features are used to improve protein structural class prediction, but information redundancy among them is often ignored. To select the important features with strong classification ability, we propose recursive feature selection with random forest to improve protein structural class prediction. We evaluated the proposed method in four experiments and compared it with the available competing prediction methods. The results indicate that the proposed feature selection method effectively improves the efficiency of protein structural class prediction. Fewer than 5% of the features are used, yet the prediction accuracy improves by 4.6-13.3%. We further compared different protein features and found that the predicted secondary structural features achieve the best performance. This understanding can be used to design more powerful prediction methods for the protein structural class. |
format | Online Article Text |
id | pubmed-8123985 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8123985 2021-05-27 Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences Wang, Yaoxin; Xu, Yingjie; Yang, Zhenyu; Liu, Xiaoqing; Dai, Qi Comput Math Methods Med Research Article Many combinations of protein features are used to improve protein structural class prediction, but information redundancy among them is often ignored. To select the important features with strong classification ability, we propose recursive feature selection with random forest to improve protein structural class prediction. We evaluated the proposed method in four experiments and compared it with the available competing prediction methods. The results indicate that the proposed feature selection method effectively improves the efficiency of protein structural class prediction. Fewer than 5% of the features are used, yet the prediction accuracy improves by 4.6-13.3%. We further compared different protein features and found that the predicted secondary structural features achieve the best performance. This understanding can be used to design more powerful prediction methods for the protein structural class. Hindawi 2021-05-07 /pmc/articles/PMC8123985/ /pubmed/34055035 http://dx.doi.org/10.1155/2021/5529389 Text en Copyright © 2021 Yaoxin Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Wang, Yaoxin; Xu, Yingjie; Yang, Zhenyu; Liu, Xiaoqing; Dai, Qi Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences |
title | Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences |
title_full | Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences |
title_fullStr | Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences |
title_full_unstemmed | Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences |
title_short | Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences |
title_sort | using recursive feature selection with random forest to improve protein structural class prediction for low-similarity sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8123985/ https://www.ncbi.nlm.nih.gov/pubmed/34055035 http://dx.doi.org/10.1155/2021/5529389 |
work_keys_str_mv | AT wangyaoxin usingrecursivefeatureselectionwithrandomforesttoimproveproteinstructuralclasspredictionforlowsimilaritysequences AT xuyingjie usingrecursivefeatureselectionwithrandomforesttoimproveproteinstructuralclasspredictionforlowsimilaritysequences AT yangzhenyu usingrecursivefeatureselectionwithrandomforesttoimproveproteinstructuralclasspredictionforlowsimilaritysequences AT liuxiaoqing usingrecursivefeatureselectionwithrandomforesttoimproveproteinstructuralclasspredictionforlowsimilaritysequences AT daiqi usingrecursivefeatureselectionwithrandomforesttoimproveproteinstructuralclasspredictionforlowsimilaritysequences |