Predicting subcellular localization of multisite proteins using differently weighted multi-label k-nearest neighbors sets

Bibliographic Details
Main authors: Jiang, Zhongting; Wang, Dong; Wu, Peng; Chen, Yuehui; Shang, Huijie; Wang, Luyao; Xie, Huichun
Format: Online Article Text
Language: English
Published: IOS Press 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6598103/
https://www.ncbi.nlm.nih.gov/pubmed/31045538
http://dx.doi.org/10.3233/THC-199018
collection PubMed
description BACKGROUND: For a protein to execute its function, its correct subcellular localization is essential. In addition to biological experiments, bioinformatics is widely used to predict and determine the subcellular localization of proteins. However, single-feature extraction methods cannot effectively handle the large volume of data and the multisite localization of proteins. Thus, we developed a pseudo amino acid composition (PseAAC) method and an entropy density technique to extract feature fusion information from subcellular multisite proteins. OBJECTIVE: To predict multiplex protein subcellular localization with high prediction accuracy. METHOD: To improve the efficiency of predicting multiplex protein subcellular localization, we used the multi-label k-nearest neighbors algorithm and assigned different weights to various attributes. The method was evaluated using several performance metrics on a dataset consisting of protein sequences with single-site and multisite subcellular localizations. RESULTS: Evaluation experiments showed that the proposed method significantly improves the optimal overall accuracy rate of multiplex protein subcellular localization. CONCLUSION: This method can help to predict protein subcellular localization more comprehensively and to better understand protein function, thereby bridging the gap between theory and application and supporting improved identification and monitoring of drug targets.
id pubmed-6598103
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6598103 2019-07-01. Technol Health Care. Research Article. IOS Press 2019-06-18. Text en. © 2019 – IOS Press and the authors. All rights reserved. https://creativecommons.org/licenses/by-nc/4.0/ This article is published online with Open Access and distributed under the terms of the Creative Commons Attribution Non-Commercial License (CC BY-NC 4.0).
topic Research Article