LBoost: A Boosting Algorithm with Application for Epistasis Discovery

Many human diseases are attributable to complex interactions among genetic and environmental factors. Statistical tools capable of modeling such complex interactions are necessary to improve identification of genetic factors that increase a patient's risk of disease. Logic Forest (LF), a baggin...

Bibliographic Details
Main authors: Wolf, Bethany J.; Hill, Elizabeth G.; Slate, Elizabeth H.; Neumann, Carola A.; Kistner-Griffin, Emily
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3493573/
https://www.ncbi.nlm.nih.gov/pubmed/23144812
http://dx.doi.org/10.1371/journal.pone.0047281
author Wolf, Bethany J.
Hill, Elizabeth G.
Slate, Elizabeth H.
Neumann, Carola A.
Kistner-Griffin, Emily
collection PubMed
description Many human diseases are attributable to complex interactions among genetic and environmental factors. Statistical tools capable of modeling such complex interactions are necessary to improve identification of genetic factors that increase a patient's risk of disease. Logic Forest (LF), a bagging ensemble algorithm based on logic regression (LR), is able to discover interactions among binary variables predictive of response such as the biologic interactions that predispose individuals to disease. However, LF's ability to recover interactions degrades for more infrequently occurring interactions. A rare genetic interaction may occur if, for example, the interaction increases disease risk in a patient subpopulation that represents only a small proportion of the overall patient population. We present an alternative ensemble adaptation of LR based on boosting rather than bagging called LBoost. We compare the ability of LBoost and LF to identify variable interactions in simulation studies. Results indicate that LBoost is superior to LF for identifying genetic interactions associated with disease that are infrequent in the population. We apply LBoost to a subset of single nucleotide polymorphisms on the PRDX genes from the Cancer Genetic Markers of Susceptibility Breast Cancer Scan to investigate genetic risk for breast cancer. LBoost is publicly available on CRAN as part of the LogicForest package, http://cran.r-project.org/.
format Online
Article
Text
id pubmed-3493573
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3493573 2012-11-09 LBoost: A Boosting Algorithm with Application for Epistasis Discovery Wolf, Bethany J. Hill, Elizabeth G. Slate, Elizabeth H. Neumann, Carola A. Kistner-Griffin, Emily PLoS One Research Article
Public Library of Science 2012-11-08 /pmc/articles/PMC3493573/ /pubmed/23144812 http://dx.doi.org/10.1371/journal.pone.0047281 Text en © 2012 Wolf et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title LBoost: A Boosting Algorithm with Application for Epistasis Discovery
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3493573/
https://www.ncbi.nlm.nih.gov/pubmed/23144812
http://dx.doi.org/10.1371/journal.pone.0047281