The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins
Experimental methods play a crucial role in identifying the subcellular localization of proteins and building high-quality databases. However, more efficient, automated computational methods are required to predict the subcellular localization of proteins on a large scale. Various efficient feature...
Main authors: | Wang, Lei; Zhao, Yaou; Chen, Yuehui; Wang, Dong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Taylor & Francis, 2017 |
Subjects: | Research Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5972939/ https://www.ncbi.nlm.nih.gov/pubmed/28886267 http://dx.doi.org/10.1080/21655979.2017.1373536 |
_version_ | 1783326498419113984 |
---|---|
author | Wang, Lei Zhao, Yaou Chen, Yuehui Wang, Dong |
author_sort | Wang, Lei |
collection | PubMed |
description | Experimental methods play a crucial role in identifying the subcellular localization of proteins and building high-quality databases. However, more efficient, automated computational methods are required to predict the subcellular localization of proteins on a large scale. Various efficient feature extraction methods have been proposed to predict subcellular localization, but challenges remain. In this paper, three novel feature extraction methods are established to improve multi-site prediction. The first novel feature extraction method utilizes repetitive information via moving windows based on a dipeptide pseudo amino acid composition method (R-Dipeptide). The second novel feature extraction method utilizes the impact of each amino acid residue on its following residues based on pseudo amino acids (I-PseAAC). The third novel feature extraction method provides local information about protein sequences that reflects the strength of the physicochemical properties of residues (PseAAC2). The multi-label k-nearest neighbor algorithm (MLKNN) is used to predict the subcellular localization of multi-site virus proteins. The best overall accuracy values of R-Dipeptide, I-PseAAC, and PseAAC2 when applied to dataset S from Virus-mPloc are 59.92%, 59.13%, and 57.94% respectively. |
format | Online Article Text |
id | pubmed-5972939 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Taylor & Francis |
record_format | MEDLINE/PubMed |
spelling | pubmed-5972939 2018-11-22 The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins Wang, Lei Zhao, Yaou Chen, Yuehui Wang, Dong Bioengineered Research Paper Taylor & Francis 2017-11-22 /pmc/articles/PMC5972939/ /pubmed/28886267 http://dx.doi.org/10.1080/21655979.2017.1373536 Text en © 2018 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins |
topic | Research Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5972939/ https://www.ncbi.nlm.nih.gov/pubmed/28886267 http://dx.doi.org/10.1080/21655979.2017.1373536 |
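
The description above mentions features built on dipeptide composition. As a rough illustration only, the sketch below computes the plain 400-dimensional dipeptide composition of a protein sequence. It is not the authors' R-Dipeptide, I-PseAAC, or PseAAC2 method; the function name `dipeptide_composition` and the choice to skip pairs containing non-standard residues are assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so vectors are comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the relative frequency of each dipeptide in `sequence`.

    Hypothetical helper for illustration: pairs that contain a
    non-standard residue (e.g. 'X') are skipped, which is an assumption.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    # Slide a window of width 2 along the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # No valid pair found (empty or very short sequence).
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


features = dipeptide_composition("ACAC")
```

A vector like `features` could then be passed to a multi-label classifier such as MLKNN, which the description names as the prediction algorithm; that step is not shown here.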