Cargando…

Identification of genes and haplotypes that predict rheumatoid arthritis using random forests

Random forest (RF) analysis of genetic data does not require specification of the mode of inheritance, and provides measures of variable importance that incorporate interaction effects. In this paper we describe RF-based approaches for assessment of gene and haplotype importance, and apply these app...

Descripción completa

Detalles Bibliográficos
Autores principales: Tang, Rui, Sinnwell, Jason P, Li, Jia, Rider, David N, de Andrade, Mariza, Biernacka, Joanna M
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2009
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2795969/
https://www.ncbi.nlm.nih.gov/pubmed/20018062
Descripción
Sumario:Random forest (RF) analysis of genetic data does not require specification of the mode of inheritance, and provides measures of variable importance that incorporate interaction effects. In this paper we describe RF-based approaches for assessment of gene and haplotype importance, and apply these approaches to a subset of the North American Rheumatoid Arthritis Consortium case-control data provided by Genetic Analysis Workshop 16. The RF analyses of 37 genes identified many of the same genes as logistic regression, but also suggested importance of certain single-nucleotide polymorphism and genes that were not ranked highly by logistic regression. A new permutation method did not reveal strong evidence of gene-gene interaction effects in these data. Although RFs are a promising approach for genetic data analysis, extensions beyond simple single-nucleotide polymorphism analyses and modifications to improve computational feasibility are needed.