A two-step method for variable selection in the analysis of a case-cohort study
BACKGROUND: Accurate detection and estimation of true exposure-outcome associations is important in aetiological analysis; when there are multiple potential exposure variables of interest, methods for detecting the subset of variables most likely to have true associations with the outcome of interes...
Main authors: | Newcombe, P J; Connolly, S; Seaman, S; Richardson, S; Sharp, S J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Oxford University Press
2018
|
Subjects: | Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5913627/ https://www.ncbi.nlm.nih.gov/pubmed/29136145 http://dx.doi.org/10.1093/ije/dyx224 |
_version_ | 1783316575291441152 |
---|---|
author | Newcombe, P J; Connolly, S; Seaman, S; Richardson, S; Sharp, S J |
author_facet | Newcombe, P J; Connolly, S; Seaman, S; Richardson, S; Sharp, S J |
author_sort | Newcombe, P J |
collection | PubMed |
description | BACKGROUND: Accurate detection and estimation of true exposure-outcome associations is important in aetiological analysis; when there are multiple potential exposure variables of interest, methods for detecting the subset of variables most likely to have true associations with the outcome of interest are required. Case-cohort studies often collect data on a large number of variables which have not been measured in the entire cohort (e.g. panels of biomarkers). There is a lack of guidance on methods for variable selection in case-cohort studies. METHODS: We describe and explore the application of three variable selection methods to data from a case-cohort study. These are: (i) selecting variables based on their level of significance in univariable (i.e. one-at-a-time) Prentice-weighted Cox regression models; (ii) stepwise selection applied to Prentice-weighted Cox regression; and (iii) a two-step method which applies a Bayesian variable selection algorithm to obtain posterior probabilities of selection for each variable using multivariable logistic regression followed by effect estimation using Prentice-weighted Cox regression. RESULTS: Across nine different simulation scenarios, the two-step method demonstrated higher sensitivity and lower false discovery rate than the one-at-a-time and stepwise methods. In an application of the methods to data from the EPIC-InterAct case-cohort study, the two-step method identified an additional two fatty acids as being associated with incident type 2 diabetes, compared with the one-at-a-time and stepwise methods. CONCLUSIONS: The two-step method enables more powerful and accurate detection of exposure-outcome associations in case-cohort studies. An R package is available to enable researchers to apply this method. |
format | Online Article Text |
id | pubmed-5913627 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5913627 2018-04-30 A two-step method for variable selection in the analysis of a case-cohort study Newcombe, P J; Connolly, S; Seaman, S; Richardson, S; Sharp, S J Int J Epidemiol Methods BACKGROUND: Accurate detection and estimation of true exposure-outcome associations is important in aetiological analysis; when there are multiple potential exposure variables of interest, methods for detecting the subset of variables most likely to have true associations with the outcome of interest are required. Case-cohort studies often collect data on a large number of variables which have not been measured in the entire cohort (e.g. panels of biomarkers). There is a lack of guidance on methods for variable selection in case-cohort studies. METHODS: We describe and explore the application of three variable selection methods to data from a case-cohort study. These are: (i) selecting variables based on their level of significance in univariable (i.e. one-at-a-time) Prentice-weighted Cox regression models; (ii) stepwise selection applied to Prentice-weighted Cox regression; and (iii) a two-step method which applies a Bayesian variable selection algorithm to obtain posterior probabilities of selection for each variable using multivariable logistic regression followed by effect estimation using Prentice-weighted Cox regression. RESULTS: Across nine different simulation scenarios, the two-step method demonstrated higher sensitivity and lower false discovery rate than the one-at-a-time and stepwise methods. In an application of the methods to data from the EPIC-InterAct case-cohort study, the two-step method identified an additional two fatty acids as being associated with incident type 2 diabetes, compared with the one-at-a-time and stepwise methods. CONCLUSIONS: The two-step method enables more powerful and accurate detection of exposure-outcome associations in case-cohort studies. An R package is available to enable researchers to apply this method.
Oxford University Press 2018-04 2017-11-10 /pmc/articles/PMC5913627/ /pubmed/29136145 http://dx.doi.org/10.1093/ije/dyx224 Text en © The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Newcombe, P J Connolly, S Seaman, S Richardson, S Sharp, S J A two-step method for variable selection in the analysis of a case-cohort study |
title | A two-step method for variable selection in the analysis of a case-cohort study |
title_full | A two-step method for variable selection in the analysis of a case-cohort study |
title_fullStr | A two-step method for variable selection in the analysis of a case-cohort study |
title_full_unstemmed | A two-step method for variable selection in the analysis of a case-cohort study |
title_short | A two-step method for variable selection in the analysis of a case-cohort study |
title_sort | two-step method for variable selection in the analysis of a case-cohort study |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5913627/ https://www.ncbi.nlm.nih.gov/pubmed/29136145 http://dx.doi.org/10.1093/ije/dyx224 |