Explainable statistical learning in public health for policy development: the case of real-world suicide data
BACKGROUND: In recent years, the availability of publicly available data related to public health has significantly increased. These data have substantial potential to develop public health policy; however, this requires meaningful and insightful analysis. Our aim is to demonstrate how data analysis...
Main authors: | van Schaik, Paul; Peng, Yonghong; Ojelabi, Adedokun; Ling, Jonathan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636096/ https://www.ncbi.nlm.nih.gov/pubmed/31315579 http://dx.doi.org/10.1186/s12874-019-0796-7 |
_version_ | 1783436002554019840 |
---|---|
author | van Schaik, Paul Peng, Yonghong Ojelabi, Adedokun Ling, Jonathan |
author_facet | van Schaik, Paul Peng, Yonghong Ojelabi, Adedokun Ling, Jonathan |
author_sort | van Schaik, Paul |
collection | PubMed |
description | BACKGROUND: In recent years, the availability of publicly available data related to public health has significantly increased. These data have substantial potential to develop public health policy; however, this requires meaningful and insightful analysis. Our aim is to demonstrate how data analysis techniques can be used to address the issues of data reduction, prediction and explanation using online available public health data, in order to provide a sound basis for informing public health policy. METHODS: Observational suicide prevention data were analysed from an existing online United Kingdom national public health database. Multi-collinearity analysis and principal-component analysis were used to reduce correlated data, followed by regression analyses for prediction and explanation of suicide. RESULTS: Multi-collinearity analysis was effective in reducing the indicator set of predictors by 30% and principal component analysis further reduced the set by 86%. Regression for prediction identified four significant indicator predictors of suicide behaviour (emergency hospital admissions for intentional self-harm, children leaving care, statutory homelessness and self-reported well-being/low happiness) and two main component predictors (relatedness dysfunction, and behavioural problems and mental illness). Regression for explanation identified significant moderation of a well-being predictor (low happiness) of suicide behaviour by a social factor (living alone), thereby supporting existing theory and providing insight beyond the results of regression for prediction. Two independent predictors capturing relatedness needs in social care service delivery were also identified. CONCLUSIONS: We demonstrate the effectiveness of regression techniques in the analysis of online public health data. Regression analysis for prediction and explanation can both be appropriate for public health data analysis for a better understanding of public health outcomes. 
It is therefore essential to clarify the aim of the analysis (prediction accuracy or theory development) as a basis for choosing the most appropriate model. We apply these techniques to the analysis of suicide data; however, we argue that the analysis presented in this study should be applied to datasets across public health in order to improve the quality of health policy recommendations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-019-0796-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6636096 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-66360962019-07-25 Explainable statistical learning in public health for policy development: the case of real-world suicide data van Schaik, Paul Peng, Yonghong Ojelabi, Adedokun Ling, Jonathan BMC Med Res Methodol Research Article BACKGROUND: In recent years, the availability of publicly available data related to public health has significantly increased. These data have substantial potential to develop public health policy; however, this requires meaningful and insightful analysis. Our aim is to demonstrate how data analysis techniques can be used to address the issues of data reduction, prediction and explanation using online available public health data, in order to provide a sound basis for informing public health policy. METHODS: Observational suicide prevention data were analysed from an existing online United Kingdom national public health database. Multi-collinearity analysis and principal-component analysis were used to reduce correlated data, followed by regression analyses for prediction and explanation of suicide. RESULTS: Multi-collinearity analysis was effective in reducing the indicator set of predictors by 30% and principal component analysis further reduced the set by 86%. Regression for prediction identified four significant indicator predictors of suicide behaviour (emergency hospital admissions for intentional self-harm, children leaving care, statutory homelessness and self-reported well-being/low happiness) and two main component predictors (relatedness dysfunction, and behavioural problems and mental illness). Regression for explanation identified significant moderation of a well-being predictor (low happiness) of suicide behaviour by a social factor (living alone), thereby supporting existing theory and providing insight beyond the results of regression for prediction. Two independent predictors capturing relatedness needs in social care service delivery were also identified. 
CONCLUSIONS: We demonstrate the effectiveness of regression techniques in the analysis of online public health data. Regression analysis for prediction and explanation can both be appropriate for public health data analysis for a better understanding of public health outcomes. It is therefore essential to clarify the aim of the analysis (prediction accuracy or theory development) as a basis for choosing the most appropriate model. We apply these techniques to the analysis of suicide data; however, we argue that the analysis presented in this study should be applied to datasets across public health in order to improve the quality of health policy recommendations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-019-0796-7) contains supplementary material, which is available to authorized users. BioMed Central 2019-07-17 /pmc/articles/PMC6636096/ /pubmed/31315579 http://dx.doi.org/10.1186/s12874-019-0796-7 Text en © The Author(s). 2019 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article van Schaik, Paul Peng, Yonghong Ojelabi, Adedokun Ling, Jonathan Explainable statistical learning in public health for policy development: the case of real-world suicide data |
title | Explainable statistical learning in public health for policy development: the case of real-world suicide data |
title_full | Explainable statistical learning in public health for policy development: the case of real-world suicide data |
title_fullStr | Explainable statistical learning in public health for policy development: the case of real-world suicide data |
title_full_unstemmed | Explainable statistical learning in public health for policy development: the case of real-world suicide data |
title_short | Explainable statistical learning in public health for policy development: the case of real-world suicide data |
title_sort | explainable statistical learning in public health for policy development: the case of real-world suicide data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636096/ https://www.ncbi.nlm.nih.gov/pubmed/31315579 http://dx.doi.org/10.1186/s12874-019-0796-7 |
work_keys_str_mv | AT vanschaikpaul explainablestatisticallearninginpublichealthforpolicydevelopmentthecaseofrealworldsuicidedata AT pengyonghong explainablestatisticallearninginpublichealthforpolicydevelopmentthecaseofrealworldsuicidedata AT ojelabiadedokun explainablestatisticallearninginpublichealthforpolicydevelopmentthecaseofrealworldsuicidedata AT lingjonathan explainablestatisticallearninginpublichealthforpolicydevelopmentthecaseofrealworldsuicidedata |