A review on longitudinal data analysis with random forest

In longitudinal studies variables are measured repeatedly over time, leading to clustered and correlated observations. If the goal of the study is to develop prediction models, machine learning approaches such as the powerful random forest (RF) are often promising alternatives to standard statistica...


Bibliographic Details
Main authors: Hu, Jianchang; Szymczak, Silke
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10025446/
https://www.ncbi.nlm.nih.gov/pubmed/36653905
http://dx.doi.org/10.1093/bib/bbad002
_version_ 1784909332966539264
author Hu, Jianchang
Szymczak, Silke
collection PubMed
description In longitudinal studies variables are measured repeatedly over time, leading to clustered and correlated observations. If the goal of the study is to develop prediction models, machine learning approaches such as the powerful random forest (RF) are often promising alternatives to standard statistical methods, especially in the context of high-dimensional data. In this paper, we review extensions of the standard RF method for the purpose of longitudinal data analysis. Extension methods are categorized according to the data structures for which they are designed. We consider both univariate and multivariate response longitudinal data and further categorize the repeated measurements according to whether the time effect is relevant. Even though most extensions are proposed for low-dimensional data, some can be applied to high-dimensional data. Information of available software implementations of the reviewed extensions is also given. We conclude with discussions on the limitations of our review and some future research directions.
format Online
Article
Text
id pubmed-10025446
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10025446 2023-03-21 A review on longitudinal data analysis with random forest Hu, Jianchang Szymczak, Silke Brief Bioinform Review In longitudinal studies variables are measured repeatedly over time, leading to clustered and correlated observations. If the goal of the study is to develop prediction models, machine learning approaches such as the powerful random forest (RF) are often promising alternatives to standard statistical methods, especially in the context of high-dimensional data. In this paper, we review extensions of the standard RF method for the purpose of longitudinal data analysis. Extension methods are categorized according to the data structures for which they are designed. We consider both univariate and multivariate response longitudinal data and further categorize the repeated measurements according to whether the time effect is relevant. Even though most extensions are proposed for low-dimensional data, some can be applied to high-dimensional data. Information of available software implementations of the reviewed extensions is also given. We conclude with discussions on the limitations of our review and some future research directions. Oxford University Press 2023-01-18 /pmc/articles/PMC10025446/ /pubmed/36653905 http://dx.doi.org/10.1093/bib/bbad002 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Review