Statistical learning techniques applied to epidemiology: a simulated case-control comparison study with logistic regression

BACKGROUND: When investigating covariate interactions and group associations with standard regression analyses, the relationship between the response variable and exposure may be difficult to characterize. When the relationship is nonlinear, linear modeling techniques do not capture the nonlinear in...

Bibliographic Details
Main Authors: Heine, John J, Land, Walker H, Egan, Kathleen M
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3045299/
https://www.ncbi.nlm.nih.gov/pubmed/21272346
http://dx.doi.org/10.1186/1471-2105-12-37
author Heine, John J
Land, Walker H
Egan, Kathleen M
collection PubMed
description BACKGROUND: When investigating covariate interactions and group associations with standard regression analyses, the relationship between the response variable and exposure may be difficult to characterize. When the relationship is nonlinear, linear modeling techniques do not capture the nonlinear information content. Statistical learning (SL) techniques with kernels are capable of addressing nonlinear problems without making parametric assumptions. However, these techniques do not produce findings relevant for epidemiologic interpretations. A simulated case-control study was used to contrast the information embedding characteristics and separation boundaries produced by a specific SL technique with those of logistic regression (LR) modeling, which represents a parametric approach. The SL technique consisted of a kernel mapping in combination with a perceptron neural network. Because the LR model has an important epidemiologic interpretation, the SL method was modified to produce the analogous interpretation and generate odds ratios for comparison. RESULTS: The SL approach is capable of generating odds ratios for main effects and risk factor interactions that better capture nonlinear relationships between exposure variables and outcome in comparison with LR. CONCLUSIONS: The integration of SL methods in epidemiology may improve both the understanding and interpretation of complex exposure/disease relationships.
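The abstract refers to the epidemiologic interpretation of the LR model, namely that an exponentiated coefficient is an odds ratio. The following is a minimal sketch of that relationship only, not the authors' simulation or their SL method. The sample size, exposure prevalence, true odds ratio of 2, and the helper name `fit_logistic` are all illustrative assumptions.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson (hypothetical helper).

    X must already contain an intercept column; y holds 0/1 outcomes.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # predicted probabilities
        w = p * (1.0 - p)                     # variance weights
        grad = X.T @ (y - p)                  # score vector
        hess = X.T @ (X * w[:, None])         # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (assumed values, chosen only for illustration).
rng = np.random.default_rng(0)
n = 20000
exposure = rng.binomial(1, 0.4, size=n)       # binary exposure
beta0, beta1 = -1.0, np.log(2.0)              # assumed true odds ratio = 2
p_case = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * exposure)))
case = rng.binomial(1, p_case).astype(float)  # 1 = case, 0 = control

X = np.column_stack([np.ones(n), exposure])
beta_hat = fit_logistic(X, case)
odds_ratio = float(np.exp(beta_hat[1]))       # OR for exposed vs unexposed

# For a single binary exposure, the fitted OR equals the 2x2 cross-product ratio.
a = np.sum((exposure == 1) & (case == 1))     # exposed cases
b = np.sum((exposure == 1) & (case == 0))     # exposed controls
c = np.sum((exposure == 0) & (case == 1))     # unexposed cases
d = np.sum((exposure == 0) & (case == 0))     # unexposed controls
cross_product = float(a * d / (b * c))

print(f"estimated odds ratio: {odds_ratio:.2f}")
```

With one binary exposure the model is saturated, so the fitted odds ratio coincides with the cross-product ratio from the 2x2 table. A nonlinear exposure/outcome relationship of the kind the abstract describes would not be summarized by a single coefficient in this way.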
format Text
id pubmed-3045299
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3045299 2011-03-01 Statistical learning techniques applied to epidemiology: a simulated case-control comparison study with logistic regression Heine, John J Land, Walker H Egan, Kathleen M BMC Bioinformatics Research Article BioMed Central 2011-01-27 /pmc/articles/PMC3045299/ /pubmed/21272346 http://dx.doi.org/10.1186/1471-2105-12-37 Text en Copyright ©2011 Heine et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Statistical learning techniques applied to epidemiology: a simulated case-control comparison study with logistic regression
topic Research Article