
A comparison of classification methods for predicting Chronic Fatigue Syndrome based on genetic data

BACKGROUND: In the studies of genomics, it is essential to select a small number of genes that are more significant than the others for the association studies of disease susceptibility. In this work, our goal was to compare computational tools with and without feature selection for predicting chron...


Bibliographic Details
Main authors: Huang, Lung-Cheng; Hsu, Sen-Yen; Lin, Eugene
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2765429/
https://www.ncbi.nlm.nih.gov/pubmed/19772600
http://dx.doi.org/10.1186/1479-5876-7-81
_version_ 1782173147385036800
author Huang, Lung-Cheng
Hsu, Sen-Yen
Lin, Eugene
author_facet Huang, Lung-Cheng
Hsu, Sen-Yen
Lin, Eugene
author_sort Huang, Lung-Cheng
collection PubMed
description BACKGROUND: In the studies of genomics, it is essential to select a small number of genes that are more significant than the others for the association studies of disease susceptibility. In this work, our goal was to compare computational tools with and without feature selection for predicting chronic fatigue syndrome (CFS) using genetic factors such as single nucleotide polymorphisms (SNPs). METHODS: We employed the dataset that was original to the previous study by the CDC Chronic Fatigue Syndrome Research Group. To uncover relationships between CFS and SNPs, we applied three classification algorithms including naive Bayes, the support vector machine algorithm, and the C4.5 decision tree algorithm. Furthermore, we utilized feature selection methods to identify a subset of influential SNPs. One was the hybrid feature selection approach combining the chi-squared and information-gain methods. The other was the wrapper-based feature selection method. RESULTS: The naive Bayes model with the wrapper-based approach performed maximally among predictive models to infer the disease susceptibility dealing with the complex relationship between CFS and SNPs. CONCLUSION: We demonstrated that our approach is a promising method to assess the associations between CFS and SNPs.
format Text
id pubmed-2765429
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2765429 2009-10-22 A comparison of classification methods for predicting Chronic Fatigue Syndrome based on genetic data Huang, Lung-Cheng Hsu, Sen-Yen Lin, Eugene J Transl Med Research BACKGROUND: In the studies of genomics, it is essential to select a small number of genes that are more significant than the others for the association studies of disease susceptibility. In this work, our goal was to compare computational tools with and without feature selection for predicting chronic fatigue syndrome (CFS) using genetic factors such as single nucleotide polymorphisms (SNPs). METHODS: We employed the dataset that was original to the previous study by the CDC Chronic Fatigue Syndrome Research Group. To uncover relationships between CFS and SNPs, we applied three classification algorithms including naive Bayes, the support vector machine algorithm, and the C4.5 decision tree algorithm. Furthermore, we utilized feature selection methods to identify a subset of influential SNPs. One was the hybrid feature selection approach combining the chi-squared and information-gain methods. The other was the wrapper-based feature selection method. RESULTS: The naive Bayes model with the wrapper-based approach performed maximally among predictive models to infer the disease susceptibility dealing with the complex relationship between CFS and SNPs. CONCLUSION: We demonstrated that our approach is a promising method to assess the associations between CFS and SNPs. BioMed Central 2009-09-22 /pmc/articles/PMC2765429/ /pubmed/19772600 http://dx.doi.org/10.1186/1479-5876-7-81 Text en Copyright © 2009 Huang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Huang, Lung-Cheng
Hsu, Sen-Yen
Lin, Eugene
A comparison of classification methods for predicting Chronic Fatigue Syndrome based on genetic data
title A comparison of classification methods for predicting Chronic Fatigue Syndrome based on genetic data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2765429/
https://www.ncbi.nlm.nih.gov/pubmed/19772600
http://dx.doi.org/10.1186/1479-5876-7-81