Using random forests to model 90-day hometime in people with stroke
BACKGROUND: Ninety-day hometime, the number of days a patient is living in the community in the first 90 days after stroke, exhibits a non-normal bucket-shaped distribution, with lower and upper constraints making its analysis difficult. In this proof-of-concept study, we evaluated the performance of rand...
Main authors: | Holodinsky, Jessalyn K.; Yu, Amy Y. X.; Kapral, Moira K.; Austin, Peter C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112132/ https://www.ncbi.nlm.nih.gov/pubmed/33971827 http://dx.doi.org/10.1186/s12874-021-01289-8 |
_version_ | 1783690630197673984 |
---|---|
author | Holodinsky, Jessalyn K. Yu, Amy Y. X. Kapral, Moira K. Austin, Peter C. |
author_facet | Holodinsky, Jessalyn K. Yu, Amy Y. X. Kapral, Moira K. Austin, Peter C. |
author_sort | Holodinsky, Jessalyn K. |
collection | PubMed |
description | BACKGROUND: Ninety-day hometime, the number of days a patient is living in the community in the first 90 days after stroke, exhibits a non-normal bucket-shaped distribution, with lower and upper constraints making its analysis difficult. In this proof-of-concept study, we evaluated the performance of random forests regression in the analysis of hometime. METHODS: Using administrative data, we identified stroke hospitalizations between 2010 and 2017 in Ontario, Canada. We used random forests regression to predict 90-day hometime using 15 covariates. Model accuracy was determined using the r-squared statistic. Variable importance in prediction and the marginal effects of each covariate were explored. RESULTS: We identified 75,745 eligible patients. Median 90-day hometime was 59 days (Q1: 2, Q3: 83). Random forests predicted hometime with reasonable accuracy (adjusted r-squared 0.3462); no implausible values were predicted, but extreme values were predicted with low accuracy. Frailty, stroke severity, and age exhibited inverse non-linear relationships with hometime, and patients arriving by ambulance had less hometime than those who did not. CONCLUSIONS: Random forests may be a useful method for analyzing 90-day hometime and capturing the complex non-linear relationships that exist between predictors and hometime. Future work should compare random forests to other models and focus on improving the accuracy of predictions of extreme values of hometime. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-021-01289-8. |
format | Online Article Text |
id | pubmed-8112132 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-81121322021-05-12 Using random forests to model 90-day hometime in people with stroke Holodinsky, Jessalyn K. Yu, Amy Y. X. Kapral, Moira K. Austin, Peter C. BMC Med Res Methodol Research Article BACKGROUND: Ninety-day hometime, the number of days a patient is living in the community in the first 90 days after stroke, exhibits a non-normal bucket-shaped distribution, with lower and upper constraints making its analysis difficult. In this proof-of-concept study, we evaluated the performance of random forests regression in the analysis of hometime. METHODS: Using administrative data, we identified stroke hospitalizations between 2010 and 2017 in Ontario, Canada. We used random forests regression to predict 90-day hometime using 15 covariates. Model accuracy was determined using the r-squared statistic. Variable importance in prediction and the marginal effects of each covariate were explored. RESULTS: We identified 75,745 eligible patients. Median 90-day hometime was 59 days (Q1: 2, Q3: 83). Random forests predicted hometime with reasonable accuracy (adjusted r-squared 0.3462); no implausible values were predicted, but extreme values were predicted with low accuracy. Frailty, stroke severity, and age exhibited inverse non-linear relationships with hometime, and patients arriving by ambulance had less hometime than those who did not. CONCLUSIONS: Random forests may be a useful method for analyzing 90-day hometime and capturing the complex non-linear relationships that exist between predictors and hometime. Future work should compare random forests to other models and focus on improving the accuracy of predictions of extreme values of hometime. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-021-01289-8. 
BioMed Central 2021-05-10 /pmc/articles/PMC8112132/ /pubmed/33971827 http://dx.doi.org/10.1186/s12874-021-01289-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Holodinsky, Jessalyn K. Yu, Amy Y. X. Kapral, Moira K. Austin, Peter C. Using random forests to model 90-day hometime in people with stroke |
title | Using random forests to model 90-day hometime in people with stroke |
title_full | Using random forests to model 90-day hometime in people with stroke |
title_fullStr | Using random forests to model 90-day hometime in people with stroke |
title_full_unstemmed | Using random forests to model 90-day hometime in people with stroke |
title_short | Using random forests to model 90-day hometime in people with stroke |
title_sort | using random forests to model 90-day hometime in people with stroke |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112132/ https://www.ncbi.nlm.nih.gov/pubmed/33971827 http://dx.doi.org/10.1186/s12874-021-01289-8 |
work_keys_str_mv | AT holodinskyjessalynk usingrandomforeststomodel90dayhometimeinpeoplewithstroke AT yuamyyx usingrandomforeststomodel90dayhometimeinpeoplewithstroke AT kapralmoirak usingrandomforeststomodel90dayhometimeinpeoplewithstroke AT austinpeterc usingrandomforeststomodel90dayhometimeinpeoplewithstroke |
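
The abstract in this record reports that the model's predictions stayed within plausible limits and summarizes accuracy with an adjusted r-squared. The sketch below illustrates both ideas on simulated data and is not the authors' analysis. Everything in it is an assumption made for illustration: the data are synthetic, age is the only covariate, the ensemble is a bag of one-split regression trees rather than a full random forest (there is no random feature selection), and the adjusted r-squared uses the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1), which the record does not confirm as the one the authors applied.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def r_squared(y_true, y_pred):
    """Proportion of outcome variance explained by the predictions."""
    y_bar = _mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_covariates):
    """Standard adjustment of R^2 for the number of covariates."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_covariates - 1)


def simulate(n, rng):
    """Hypothetical data: hometime falls with age and is clamped to [0, 90]."""
    ages = [rng.randint(40, 95) for _ in range(n)]
    hometime = [min(90.0, max(0.0, 120.0 - 1.2 * a + rng.gauss(0, 20)))
                for a in ages]
    return ages, hometime


def fit_stump(xs, ys):
    """One-split regression tree: returns (threshold, left_mean, right_mean)."""
    overall = _mean(ys)
    best = (float("inf"), float("inf"), overall, overall)
    for t in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = _mean(left), _mean(right)
        sse = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1:]


def fit_forest(xs, ys, n_trees, rng):
    """Fit each stump to a bootstrap resample of the data (bagging)."""
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict(stumps, x):
    """Average the leaf means across the ensemble."""
    return _mean([left if x <= t else right for t, left, right in stumps])


rng = random.Random(0)
ages, hometime = simulate(300, rng)
forest = fit_forest(ages, hometime, n_trees=25, rng=rng)
preds = [predict(forest, a) for a in ages]

r2 = r_squared(hometime, preds)
adj = adjusted_r_squared(r2, n_obs=len(ages), n_covariates=1)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}")
print(f"prediction range: {min(preds):.1f} to {max(preds):.1f}")
```

Each prediction is an average of leaf means, and each leaf mean is an average of observed outcomes, so no prediction can fall outside the observed 0 to 90 range. The same averaging pulls predictions toward the middle of the distribution, which is consistent with the abstract's note that extreme values were predicted with low accuracy.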