Identification of apolipoprotein using feature selection technique
Apolipoproteins are proteins that transport lipids through the lymphatic and circulatory systems. An abnormal expression level of apolipoprotein always causes angiocardiopathy. Thus, correctly recognizing apolipoproteins in proteomic data is crucial to the comprehension of ca...
Main authors: | Tang, Hua; Zou, Ping; Zhang, Chunmei; Chen, Rong; Chen, Wei; Lin, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4957217/ https://www.ncbi.nlm.nih.gov/pubmed/27443605 http://dx.doi.org/10.1038/srep30441 |
_version_ | 1782444141416808448 |
---|---|
author | Tang, Hua; Zou, Ping; Zhang, Chunmei; Chen, Rong; Chen, Wei; Lin, Hao |
author_sort | Tang, Hua |
collection | PubMed |
description | Apolipoproteins are proteins that transport lipids through the lymphatic and circulatory systems. An abnormal expression level of apolipoprotein always causes angiocardiopathy. Thus, correctly recognizing apolipoproteins in proteomic data is crucial to understanding the cardiovascular system and to drug design. This study develops a computational model to predict apolipoproteins. Apolipoproteins and non-apolipoproteins were collected to form a benchmark dataset. From this dataset, we extracted g-gap dipeptide composition information from the residue sequences to formulate the protein samples. To exclude redundant information and noise, an analysis of variance (ANOVA)-based feature selection technique was used to find the best feature subset. A support vector machine (SVM) was selected as the discrimination algorithm. The model achieved 96.2% sensitivity and 99.3% specificity in five-fold cross-validation. These findings open new perspectives for improving apolipoprotein prediction by considering specific dipeptides. We expect that they will help to improve the development of drugs against angiocardiopathy. |
format | Online Article Text |
id | pubmed-4957217 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-4957217; 2016-07-26; Identification of apolipoprotein using feature selection technique; Tang, Hua; Zou, Ping; Zhang, Chunmei; Chen, Rong; Chen, Wei; Lin, Hao; Sci Rep; Article. Apolipoproteins are proteins that transport lipids through the lymphatic and circulatory systems. An abnormal expression level of apolipoprotein always causes angiocardiopathy. Thus, correctly recognizing apolipoproteins in proteomic data is crucial to understanding the cardiovascular system and to drug design. This study develops a computational model to predict apolipoproteins. Apolipoproteins and non-apolipoproteins were collected to form a benchmark dataset. From this dataset, we extracted g-gap dipeptide composition information from the residue sequences to formulate the protein samples. To exclude redundant information and noise, an analysis of variance (ANOVA)-based feature selection technique was used to find the best feature subset. A support vector machine (SVM) was selected as the discrimination algorithm. The model achieved 96.2% sensitivity and 99.3% specificity in five-fold cross-validation. These findings open new perspectives for improving apolipoprotein prediction by considering specific dipeptides. We expect that they will help to improve the development of drugs against angiocardiopathy. Nature Publishing Group; 2016-07-22; /pmc/articles/PMC4957217/ /pubmed/27443605 http://dx.doi.org/10.1038/srep30441; Text; en. Copyright © 2016, The Author(s). http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
title | Identification of apolipoprotein using feature selection technique |
title_sort | identification of apolipoprotein using feature selection technique |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4957217/ https://www.ncbi.nlm.nih.gov/pubmed/27443605 http://dx.doi.org/10.1038/srep30441 |
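
The description above mentions two computational steps: g-gap dipeptide composition as the sequence encoding and ANOVA-based scoring of features. The following is a minimal Python sketch of both, not the authors' implementation. It assumes the common definition in which each of the 400 dipeptide frequencies is the count of residue pairs separated by g positions divided by (L - g - 1), and the standard one-way ANOVA F-statistic. The function names `g_gap_dipeptide_composition` and `anova_f_score` are hypothetical and chosen only for illustration.

```python
from itertools import product

# The 20 standard amino acids and the 400 ordered residue pairs (AA, AC, AD, ...).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def g_gap_dipeptide_composition(sequence, g=0):
    """Return a 400-dimensional frequency vector of residue pairs separated by g residues.

    Assumed definition (not taken from the record): frequency = count / (L - g - 1).
    Pairs containing non-standard residues are skipped.
    """
    seq = sequence.upper()
    n_pairs = len(seq) - g - 1
    if n_pairs <= 0:
        raise ValueError("sequence is too short for the chosen gap")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:
            counts[pair] += 1
    return [counts[d] / n_pairs for d in DIPEPTIDES]


def anova_f_score(groups):
    """One-way ANOVA F-statistic for a single feature.

    `groups` holds one list of feature values per class, for example
    [values_in_apolipoproteins, values_in_non_apolipoproteins].
    Raises ZeroDivisionError if the within-group variance is zero.
    """
    k = len(groups)
    n_total = sum(len(values) for values in groups)
    grand_mean = sum(sum(values) for values in groups) / n_total
    means = [sum(values) / len(values) for values in groups]
    between = sum(
        len(values) * (mean - grand_mean) ** 2 for values, mean in zip(groups, means)
    ) / (k - 1)
    within = sum(
        (x - mean) ** 2 for values, mean in zip(groups, means) for x in values
    ) / (n_total - k)
    return between / within
```

In a pipeline of the kind the description outlines, features would be ranked by this score and the top-ranked subset passed to a classifier such as an SVM; the ranking cutoff and classifier settings are not given in this record.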