A comparative study of different variable selection methods based on numerical simulation and empirical analysis

This study employs the principles of computer science and statistics to evaluate the efficacy of the linear random effect model, utilizing Lasso variable selection techniques (including Lasso, Elastic-Net, Adaptive-Lasso, and SCAD) through numerical simulation and empirical research. The analysis fo...

Bibliographic Details
Main Authors: Hou, Dake, Zhou, Wenli, Zhang, Qiuxia, Zhang, Kun, Fang, Jiaqi
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Data Science
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495967/
https://www.ncbi.nlm.nih.gov/pubmed/37705642
http://dx.doi.org/10.7717/peerj-cs.1522
author Hou, Dake
Zhou, Wenli
Zhang, Qiuxia
Zhang, Kun
Fang, Jiaqi
author_facet Hou, Dake
Zhou, Wenli
Zhang, Qiuxia
Zhang, Kun
Fang, Jiaqi
author_sort Hou, Dake
collection PubMed
description This study employs the principles of computer science and statistics to evaluate the efficacy of the linear random effect model, utilizing Lasso variable selection techniques (including Lasso, Elastic-Net, Adaptive-Lasso, and SCAD) through numerical simulation and empirical research. The analysis focuses on the model’s consistency in variable selection, prediction accuracy, stability, and efficiency. This study employs a novel approach to assess the consistency of variable selection across models. Specifically, the angle between the actual coefficient vector β and the estimated coefficient vector β̂ is computed to determine the degree of consistency. Additionally, the boxplot tool of statistical analysis is utilized to visually represent the distribution of model prediction accuracy data and variable selection consistency. The comparative stability of each model is assessed based on the frequency of outliers. This study conducts comparative experiments of numerical simulation to evaluate a proposed model evaluation method against commonly used analysis methods. The results demonstrate the effectiveness and correctness of the proposed method, highlighting its ability to conveniently analyze the stability and efficiency of each fitting model.
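The two measures named in the description can be illustrated with a minimal sketch. This is not the authors' code: the record does not give the exact formulas, so the sketch assumes the angle is the usual arccosine of the normalized inner product of the two coefficient vectors, and that "outliers" are points beyond the conventional 1.5 × IQR boxplot whiskers. The function names `coefficient_angle` and `outlier_fraction` are hypothetical.

```python
import numpy as np


def coefficient_angle(beta_true, beta_hat):
    """Angle in degrees between the true and estimated coefficient vectors.

    Assumption: the angle is arccos of the cosine similarity. A smaller
    angle means the estimate points in nearly the same direction as the
    true vector.
    """
    beta_true = np.asarray(beta_true, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    norm_product = np.linalg.norm(beta_true) * np.linalg.norm(beta_hat)
    if norm_product == 0.0:
        # The angle is undefined when either vector is all zeros.
        return float("nan")
    cosine = np.dot(beta_true, beta_hat) / norm_product
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))


def outlier_fraction(values, whisker=1.5):
    """Share of values outside the boxplot whiskers.

    Assumption: whiskers follow the common Tukey rule, Q1 - 1.5*IQR and
    Q3 + 1.5*IQR. A lower share is read here as a more stable model.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return float(np.mean((values < low) | (values > high)))
```

In a simulation study, `coefficient_angle` would be evaluated once per replication and per method, and `outlier_fraction` applied to the resulting angles (or prediction errors) for each method to compare stability.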
format Online
Article
Text
id pubmed-10495967
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10495967 2023-09-13 A comparative study of different variable selection methods based on numerical simulation and empirical analysis Hou, Dake Zhou, Wenli Zhang, Qiuxia Zhang, Kun Fang, Jiaqi PeerJ Comput Sci Data Science This study employs the principles of computer science and statistics to evaluate the efficacy of the linear random effect model, utilizing Lasso variable selection techniques (including Lasso, Elastic-Net, Adaptive-Lasso, and SCAD) through numerical simulation and empirical research. The analysis focuses on the model’s consistency in variable selection, prediction accuracy, stability, and efficiency. This study employs a novel approach to assess the consistency of variable selection across models. Specifically, the angle between the actual coefficient vector β and the estimated coefficient vector β̂ is computed to determine the degree of consistency. Additionally, the boxplot tool of statistical analysis is utilized to visually represent the distribution of model prediction accuracy data and variable selection consistency. The comparative stability of each model is assessed based on the frequency of outliers. This study conducts comparative experiments of numerical simulation to evaluate a proposed model evaluation method against commonly used analysis methods. The results demonstrate the effectiveness and correctness of the proposed method, highlighting its ability to conveniently analyze the stability and efficiency of each fitting model. PeerJ Inc. 2023-08-16 /pmc/articles/PMC10495967/ /pubmed/37705642 http://dx.doi.org/10.7717/peerj-cs.1522 Text en ©2023 Hou et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
spellingShingle Data Science
Hou, Dake
Zhou, Wenli
Zhang, Qiuxia
Zhang, Kun
Fang, Jiaqi
A comparative study of different variable selection methods based on numerical simulation and empirical analysis
title A comparative study of different variable selection methods based on numerical simulation and empirical analysis
title_full A comparative study of different variable selection methods based on numerical simulation and empirical analysis
title_fullStr A comparative study of different variable selection methods based on numerical simulation and empirical analysis
title_full_unstemmed A comparative study of different variable selection methods based on numerical simulation and empirical analysis
title_short A comparative study of different variable selection methods based on numerical simulation and empirical analysis
title_sort comparative study of different variable selection methods based on numerical simulation and empirical analysis
topic Data Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495967/
https://www.ncbi.nlm.nih.gov/pubmed/37705642
http://dx.doi.org/10.7717/peerj-cs.1522