Improved Variable Selection Algorithm Using a LASSO-Type Penalty, with an Application to Assessing Hepatitis B Infection Relevant Factors in Community Residents

OBJECTIVES: In epidemiological studies, it is important to identify independent associations between collective exposures and a health outcome. The current stepwise selection technique ignores stochastic errors and suffers from a lack of stability. The alternative LASSO-penalized regression model ca...

Bibliographic Details
Main Authors: Guo, Pi; Zeng, Fangfang; Hu, Xiaomin; Zhang, Dingmei; Zhu, Shuming; Deng, Yu; Hao, Yuantao
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4516242/
https://www.ncbi.nlm.nih.gov/pubmed/26214802
http://dx.doi.org/10.1371/journal.pone.0134151
author Guo, Pi
Zeng, Fangfang
Hu, Xiaomin
Zhang, Dingmei
Zhu, Shuming
Deng, Yu
Hao, Yuantao
collection PubMed
description OBJECTIVES: In epidemiological studies, it is important to identify independent associations between collective exposures and a health outcome. The current stepwise selection technique ignores stochastic errors and suffers from a lack of stability. The alternative LASSO-penalized regression model can be applied to detect significant predictors from a pool of candidate variables. However, this technique is prone to false positives and tends to create excessive biases. It remains challenging to develop robust variable selection methods and enhance predictability.
MATERIAL AND METHODS: Two improved algorithms, denoted the two-stage hybrid and bootstrap ranking procedures, both using a LASSO-type penalty, were developed for epidemiological association analysis. The performance of the proposed procedures and of other methods, including conventional LASSO, Bolasso, stepwise and stability selection models, was evaluated using intensive simulation. In addition, the methods were compared in an empirical analysis based on large-scale survey data of hepatitis B infection-relevant factors among Guangdong residents.
RESULTS: The proposed procedures produced comparable or less biased selection results when compared to conventional variable selection models. Overall, the two newly proposed procedures were stable across the various simulation scenarios, demonstrating a higher power and a lower false positive rate during variable selection than the compared methods. In the empirical analysis, the proposed procedures yielded a sparse set of hepatitis B infection-relevant factors, gave the best predictive performance, and were able to select a more stringent set of factors. According to the proposed procedures, individual history of hepatitis B vaccination and family and individual history of hepatitis B infection were associated with hepatitis B infection in the studied residents.
CONCLUSIONS: The newly proposed procedures improve the identification of significant variables and enable us to derive a new insight into epidemiological association analysis.
format Online
Article
Text
id pubmed-4516242
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-45162422015-07-29 PLoS One Research Article Public Library of Science 2015-07-27 /pmc/articles/PMC4516242/ /pubmed/26214802 http://dx.doi.org/10.1371/journal.pone.0134151 Text en © 2015 Guo et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Improved Variable Selection Algorithm Using a LASSO-Type Penalty, with an Application to Assessing Hepatitis B Infection Relevant Factors in Community Residents
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4516242/
https://www.ncbi.nlm.nih.gov/pubmed/26214802
http://dx.doi.org/10.1371/journal.pone.0134151