Predicting self-perceived general health status using machine learning: an external exposome study
Main authors: | , , , , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10230687/ https://www.ncbi.nlm.nih.gov/pubmed/37259056 http://dx.doi.org/10.1186/s12889-023-15962-8 |
Summary: | BACKGROUND: Self-perceived general health (SPGH) is a general health indicator commonly used in epidemiological research and is associated with a wide range of exposures from different domains. However, most studies on SPGH only investigated a limited set of exposures and did not take the entire external exposome into account. We aimed to develop predictive models for SPGH based on exposome datasets using machine learning techniques and identify the most important predictors of poor SPGH status. METHODS: Random forest (RF) was used on two datasets based on personal characteristics from the 2012 and 2016 editions of the Dutch national health survey, enriched with environmental and neighborhood characteristics. Model performance was determined using the area under the curve (AUC) score. The most important predictors were identified using a variable importance procedure and individual effects of exposures using partial dependence and accumulated local effect plots. The final 2012 dataset contained information on 199,840 individuals and 81 variables, whereas the final 2016 dataset had 244,557 individuals with 91 variables. RESULTS: Our RF models had overall good predictive performance (2012: AUC = 0.864 (CI: 0.852–0.876); 2016: AUC = 0.890 (CI: 0.883–0.896)) and the most important predictors were “Control of own life”, “Physical activity”, “Loneliness” and “Making ends meet”. Subjects who felt insufficiently in control of their own life, scored high on the De Jong-Gierveld loneliness scale or had difficulty in making ends meet were more likely to have poor SPGH status, whereas increased physical activity per week reduced the probability of poor SPGH. We observed associations between some neighborhood and environmental characteristics, but these variables did not contribute to the overall predictive strength of the models.
CONCLUSIONS: This study identified that within an external exposome dataset, the most important predictors for SPGH status are related to mental wellbeing, physical exercise, loneliness, and financial status. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12889-023-15962-8. |
---|
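The summary reports model performance as an area under the curve (AUC) score. As a rough illustration of what that number measures, the sketch below computes AUC from scratch using the pairwise-ranking (Mann-Whitney) identity: the fraction of (poor SPGH, good SPGH) pairs in which the poor-SPGH case receives the higher predicted score. The function name `auc_score` and the labels and scores are made up for illustration; they are not taken from the study, whose own tooling is not described in this record.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the pairwise-ranking (Mann-Whitney) identity.

    labels: iterable of 0/1 outcomes (here, 1 = poor SPGH is an assumption).
    scores: iterable of predicted scores, higher meaning "more likely 1".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: six respondents with invented predicted probabilities.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
example_auc = auc_score(labels, scores)
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so `example_auc` is 8/9 (about 0.89). The quadratic pair loop is chosen for readability; on datasets the size of those in the summary, a rank-based or library implementation would be the practical choice.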