Selection of important variables by statistical learning in genome-wide association analysis
Genetic analysis of complex diseases demands novel analytical methods to interpret data collected on thousands of variables by genome-wide association studies. The complexity of such analysis is multiplied when one has to consider interaction effects, be they among the genetic variations (G × G) or...
Main Authors: | Yang, Wei (Will), Gu, C Charles |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2795972/ https://www.ncbi.nlm.nih.gov/pubmed/20018065 |
_version_ | 1782175482831175680 |
---|---|
author | Yang, Wei (Will) Gu, C Charles |
author_facet | Yang, Wei (Will) Gu, C Charles |
author_sort | Yang, Wei (Will) |
collection | PubMed |
description | Genetic analysis of complex diseases demands novel analytical methods to interpret data collected on thousands of variables by genome-wide association studies. The complexity of such analysis is multiplied when one has to consider interaction effects, be they among the genetic variations (G × G) or with environmental risk factors (G × E). Several statistical learning methods seem quite promising in this context. Herein we consider applications of two such methods, random forest and Bayesian networks, to the simulated dataset for Genetic Analysis Workshop 16 Problem 3. Our evaluation study showed that an iterative search based on the random forest approach has potential for selecting important variables, while Bayesian networks can capture some of the underlying causal relationships. |
format | Text |
id | pubmed-2795972 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-27959722009-12-18 Selection of important variables by statistical learning in genome-wide association analysis Yang, Wei (Will) Gu, C Charles BMC Proc Proceedings Genetic analysis of complex diseases demands novel analytical methods to interpret data collected on thousands of variables by genome-wide association studies. The complexity of such analysis is multiplied when one has to consider interaction effects, be they among the genetic variations (G × G) or with environment risk factors (G × E). Several statistical learning methods seem quite promising in this context. Herein we consider applications of two such methods, random forest and Bayesian networks, to the simulated dataset for Genetic Analysis Workshop 16 Problem 3. Our evaluation study showed that an iterative search based on the random forest approach has the potential in selecting important variables, while Bayesian networks can capture some of the underlying causal relationships. BioMed Central 2009-12-15 /pmc/articles/PMC2795972/ /pubmed/20018065 Text en Copyright ©2009 Yang and Gu; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Yang, Wei (Will) Gu, C Charles Selection of important variables by statistical learning in genome-wide association analysis |
title | Selection of important variables by statistical learning in genome-wide association analysis |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2795972/ https://www.ncbi.nlm.nih.gov/pubmed/20018065 |