Mapping complex traits using Random Forests

Random Forest is a prediction technique based on growing trees on bootstrap samples of data, in conjunction with a random selection of explanatory variables to define the best split at each node. In the case of a quantitative outcome, the tree predictor takes on a numerical value. We applied Random...

Bibliographic Details
Main authors: Bureau, Alexandre, Dupuis, Josée, Hayward, Brooke, Falls, Kathleen, Van Eerdewegh, Paul
Format: Text
Language: English
Published: BioMed Central 2003
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1866502/
https://www.ncbi.nlm.nih.gov/pubmed/14975132
http://dx.doi.org/10.1186/1471-2156-4-S1-S64
author Bureau, Alexandre
Dupuis, Josée
Hayward, Brooke
Falls, Kathleen
Van Eerdewegh, Paul
collection PubMed
description Random Forest is a prediction technique based on growing trees on bootstrap samples of data, in conjunction with a random selection of explanatory variables to define the best split at each node. In the case of a quantitative outcome, the tree predictor takes on a numerical value. We applied Random Forest to the first replicate of the Genetic Analysis Workshop 13 simulated data set, with the sibling pairs as our units of analysis and identity by descent (IBD) at selected loci as our explanatory variables. With the knowledge of the true model, we performed two sets of analyses on three phenotypes: HDL, triglycerides, and glucose. The goal was to approach the mapping of complex traits from a multivariate perspective. The first set of analyses mimics a candidate gene approach with a high proportion of true genes among the predictors while the second set represents a genome scan analysis using microsatellite markers. Random Forest was able to identify a few of the major genes influencing the phenotypes, such as baseline HDL and triglycerides, but failed to identify the major genes regulating baseline glucose levels.
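The procedure summarized in the description (trees grown on bootstrap samples, a random subset of explanatory variables considered at each split, numerical predictions for a quantitative outcome) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. The toy data, the function names, and the parameters `mtry`, `max_depth`, and `min_size` are assumptions made for the example: each row stands in for one sibling pair, each column for an IBD proportion at one locus, and only locus 0 is made to influence the simulated outcome.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X, y, rows, features):
    """Return (score, feature, threshold) minimizing child impurity, or None."""
    best = None
    for f in features:
        levels = sorted({X[i][f] for i in rows})
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def grow_tree(X, y, rows, mtry, rng, depth=0, max_depth=3, min_size=5):
    """Grow a regression tree; leaves are floats, inner nodes are tuples."""
    value = mean(y[i] for i in rows)
    if depth >= max_depth or len(rows) < 2 * min_size:
        return value
    # Random selection of explanatory variables tried at this node.
    features = rng.sample(range(len(X[0])), mtry)
    split = best_split(X, y, rows, features)
    if split is None:  # all candidate variables are constant in this node
        return value
    _, f, t = split
    left = [i for i in rows if X[i][f] <= t]
    right = [i for i in rows if X[i][f] > t]
    return (f, t,
            grow_tree(X, y, left, mtry, rng, depth + 1, max_depth, min_size),
            grow_tree(X, y, right, mtry, rng, depth + 1, max_depth, min_size))


def predict_tree(tree, x):
    """Follow splits until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


def grow_forest(X, y, n_trees=30, mtry=2, seed=0):
    """Grow each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree(X, y, rows, mtry, rng))
    return forest


def predict_forest(forest, x):
    """Forest prediction for a quantitative outcome: average over trees."""
    return mean(predict_tree(tree, x) for tree in forest)


# Hypothetical toy data: 200 "sibling pairs", 5 loci, IBD in {0, 0.5, 1}.
data_rng = random.Random(1)
X = [[data_rng.choice([0.0, 0.5, 0.5, 1.0]) for _ in range(5)]
     for _ in range(200)]
# Assumed outcome: driven by IBD at locus 0 plus Gaussian noise.
y = [2.0 * x[0] + data_rng.gauss(0, 0.5) for x in X]

forest = grow_forest(X, y)
pred_low = predict_forest(forest, [0.0, 0.5, 0.5, 0.5, 0.5])
pred_high = predict_forest(forest, [1.0, 0.5, 0.5, 0.5, 0.5])
print(pred_low, pred_high)
```

In this sketch the forest's prediction changes with the IBD value at locus 0 because that is the only column tied to the simulated outcome; the other four columns act as noise predictors.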
format Text
id pubmed-1866502
institution National Center for Biotechnology Information
language English
publishDate 2003
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1866502 2007-05-11 Mapping complex traits using Random Forests Bureau, Alexandre Dupuis, Josée Hayward, Brooke Falls, Kathleen Van Eerdewegh, Paul BMC Genet Proceedings Random Forest is a prediction technique based on growing trees on bootstrap samples of data, in conjunction with a random selection of explanatory variables to define the best split at each node. In the case of a quantitative outcome, the tree predictor takes on a numerical value. We applied Random Forest to the first replicate of the Genetic Analysis Workshop 13 simulated data set, with the sibling pairs as our units of analysis and identity by descent (IBD) at selected loci as our explanatory variables. With the knowledge of the true model, we performed two sets of analyses on three phenotypes: HDL, triglycerides, and glucose. The goal was to approach the mapping of complex traits from a multivariate perspective. The first set of analyses mimics a candidate gene approach with a high proportion of true genes among the predictors while the second set represents a genome scan analysis using microsatellite markers. Random Forest was able to identify a few of the major genes influencing the phenotypes, such as baseline HDL and triglycerides, but failed to identify the major genes regulating baseline glucose levels. BioMed Central 2003-12-31 /pmc/articles/PMC1866502/ /pubmed/14975132 http://dx.doi.org/10.1186/1471-2156-4-S1-S64 Text en Copyright © 2003 Bureau et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Mapping complex traits using Random Forests
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1866502/
https://www.ncbi.nlm.nih.gov/pubmed/14975132
http://dx.doi.org/10.1186/1471-2156-4-S1-S64