The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins

Experimental methods play a crucial role in identifying the subcellular localization of proteins and building high-quality databases. However, more efficient, automated computational methods are required to predict the subcellular localization of proteins on a large scale. Various efficient feature...

Bibliographic Details
Main Authors: Wang, Lei; Zhao, Yaou; Chen, Yuehui; Wang, Dong
Format: Online Article Text
Language: English
Published: Taylor & Francis 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5972939/
https://www.ncbi.nlm.nih.gov/pubmed/28886267
http://dx.doi.org/10.1080/21655979.2017.1373536
_version_ 1783326498419113984
author Wang, Lei
Zhao, Yaou
Chen, Yuehui
Wang, Dong
author_sort Wang, Lei
collection PubMed
description Experimental methods play a crucial role in identifying the subcellular localization of proteins and building high-quality databases. However, more efficient, automated computational methods are required to predict the subcellular localization of proteins on a large scale. Various efficient feature extraction methods have been proposed to predict subcellular localization, but challenges remain. In this paper, three novel feature extraction methods are established to improve multi-site prediction. The first novel feature extraction method utilizes repetitive information via moving windows based on a dipeptide pseudo amino acid composition method (R-Dipeptide). The second novel feature extraction method utilizes the impact of each amino acid residue on its following residues based on pseudo amino acids (I-PseAAC). The third novel feature extraction method provides local information about protein sequences that reflects the strength of the physicochemical properties of residues (PseAAC2). The multi-label k-nearest neighbor algorithm (MLKNN) is used to predict the subcellular localization of multi-site virus proteins. The best overall accuracy values of R-Dipeptide, I-PseAAC, and PseAAC2 when applied to dataset S from Virus-mPloc are 59.92%, 59.13%, and 57.94% respectively.
format Online
Article
Text
id pubmed-5972939
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Taylor & Francis
record_format MEDLINE/PubMed
spelling pubmed-5972939 2018-11-22 The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins Wang, Lei Zhao, Yaou Chen, Yuehui Wang, Dong Bioengineered Research Paper Taylor & Francis 2017-11-22 /pmc/articles/PMC5972939/ /pubmed/28886267 http://dx.doi.org/10.1080/21655979.2017.1373536 Text en © 2018 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins
title_sort effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins
topic Research Paper