k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification
This paper proposes a machine-learning-based computational method for identifying Alzheimer's disease genes. Most existing machine-learning-based methods predict Alzheimer's disease genes by using structural magnetic resonance imaging (MR...
Main Authors: | Xu, Lei; Liang, Guangmin; Liao, Changrui; Chen, Gin-Den; Chang, Chi-Chang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6379451/ https://www.ncbi.nlm.nih.gov/pubmed/30809242 http://dx.doi.org/10.3389/fgene.2019.00033 |
_version_ | 1783396088629166080 |
---|---|
author | Xu, Lei Liang, Guangmin Liao, Changrui Chen, Gin-Den Chang, Chi-Chang |
author_facet | Xu, Lei Liang, Guangmin Liao, Changrui Chen, Gin-Den Chang, Chi-Chang |
author_sort | Xu, Lei |
collection | PubMed |
description | This paper proposes a machine-learning-based computational method for identifying Alzheimer's disease genes. Most existing machine-learning-based methods predict Alzheimer's disease genes by using structural magnetic resonance imaging (MRI). Most of these methods attain acceptable results, but they are expensive and time consuming. We therefore propose a computational method that identifies Alzheimer's disease genes from the sequence information of proteins and classifies the resulting feature vectors with a random forest. In the proposed method, the protein information is extracted as adaptive k-skip-n-gram features. Experimental results show that the proposed method attains an accuracy of 85.5% on the selected UniProt dataset. |
format | Online Article Text |
id | pubmed-6379451 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-63794512019-02-26 k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification Xu, Lei Liang, Guangmin Liao, Changrui Chen, Gin-Den Chang, Chi-Chang Front Genet Genetics In this paper, a computational method based on machine learning technique for identifying Alzheimer's disease genes is proposed. Compared with most existing machine learning based methods, existing methods predict Alzheimer's disease genes by using structural magnetic resonance imaging (MRI) technique. Most methods have attained acceptable results, but the cost is expensive and time consuming. Thus, we proposed a computational method for identifying Alzheimer disease genes by use of the sequence information of proteins, and classify the feature vectors by random forest. In the proposed method, the gene protein information is extracted by adaptive k-skip-n-gram features. The proposed method can attain the accuracy to 85.5% on the selected UniProt dataset, which has been demonstrated by the experimental results. Frontiers Media S.A. 2019-02-12 /pmc/articles/PMC6379451/ /pubmed/30809242 http://dx.doi.org/10.3389/fgene.2019.00033 Text en Copyright © 2019 Xu, Liang, Liao, Chen and Chang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Xu, Lei Liang, Guangmin Liao, Changrui Chen, Gin-Den Chang, Chi-Chang k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification |
title | k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification |
title_full | k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification |
title_fullStr | k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification |
title_full_unstemmed | k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification |
title_short | k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification |
title_sort | k-skip-n-gram-rf: a random forest based method for alzheimer's disease protein identification |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6379451/ https://www.ncbi.nlm.nih.gov/pubmed/30809242 http://dx.doi.org/10.3389/fgene.2019.00033 |
work_keys_str_mv | AT xulei kskipngramrfarandomforestbasedmethodforalzheimersdiseaseproteinidentification AT liangguangmin kskipngramrfarandomforestbasedmethodforalzheimersdiseaseproteinidentification AT liaochangrui kskipngramrfarandomforestbasedmethodforalzheimersdiseaseproteinidentification AT chenginden kskipngramrfarandomforestbasedmethodforalzheimersdiseaseproteinidentification AT changchichang kskipngramrfarandomforestbasedmethodforalzheimersdiseaseproteinidentification |
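
The description above mentions k-skip-n-gram features extracted from protein sequences. As a rough illustration only, the sketch below computes plain k-skip bigram (n = 2) frequencies over the 20 standard amino acids. The record does not say how the "adaptive" variant works, so the function name `k_skip_bigram_features`, the default `k=2`, and the normalization by total pair count are all assumptions and not the authors' implementation.

```python
from itertools import product

# The 20 standard amino acid one-letter codes (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered residue pairs, in a fixed order, so every sequence
# maps to a vector with the same layout.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def k_skip_bigram_features(sequence, k=2):
    """Hypothetical helper: normalized counts of residue pairs
    (s[i], s[i + gap + 1]) for every gap from 0 to k skipped positions."""
    counts = dict.fromkeys(PAIRS, 0)
    seq = sequence.upper()
    for gap in range(k + 1):
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            # Pairs containing non-standard residues (e.g. X) are ignored.
            if pair in counts:
                counts[pair] += 1
    total = sum(counts.values())
    # An empty or too-short sequence yields an all-zero vector.
    return [counts[p] / total if total else 0.0 for p in PAIRS]


features = k_skip_bigram_features("ACDA", k=1)
```

Each sequence yields a fixed-length vector of 400 values, which is the kind of input a random forest classifier (for example scikit-learn's `RandomForestClassifier`) could be trained on; the choice of library and any hyperparameters are likewise assumptions, not details from this record.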