Assessing machine learning for fair prediction of ADHD in school pupils using a retrospective cohort study of linked education and healthcare data
OBJECTIVES: Attention deficit hyperactivity disorder (ADHD) is a prevalent childhood disorder, but often goes unrecognised and untreated. To improve access to services, accurate predictions of populations at high risk of ADHD are needed for effective resource allocation. Using a unique linked health...
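Main authors: | Ter-Minassian, Lucile; Viani, Natalia; Wickersham, Alice; Cross, Lauren; Stewart, Robert; Velupillai, Sumithra; Downs, Johnny |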
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group, 2022 |
Subjects: | Mental Health |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9723859/ https://www.ncbi.nlm.nih.gov/pubmed/36576182 http://dx.doi.org/10.1136/bmjopen-2021-058058 |
_version_ | 1784844277605466112 |
---|---|
author | Ter-Minassian, Lucile Viani, Natalia Wickersham, Alice Cross, Lauren Stewart, Robert Velupillai, Sumithra Downs, Johnny |
author_sort | Ter-Minassian, Lucile |
collection | PubMed |
description | OBJECTIVES: Attention deficit hyperactivity disorder (ADHD) is a prevalent childhood disorder, but often goes unrecognised and untreated. To improve access to services, accurate predictions of populations at high risk of ADHD are needed for effective resource allocation. Using a unique linked health and education data resource, we examined how machine learning (ML) approaches can predict risk of ADHD. DESIGN: Retrospective population cohort study. SETTING: South London (2007–2013). PARTICIPANTS: n=56 258 pupils with linked education and health data. PRIMARY OUTCOME MEASURES: Using area under the curve (AUC), we compared the predictive accuracy of four ML models and one neural network for ADHD diagnosis. Ethnic group and language biases were weighted using a fair pre-processing algorithm. RESULTS: Random forest and logistic regression prediction models provided the highest predictive accuracy for ADHD in population samples (AUC 0.86 and 0.86, respectively) and clinical samples (AUC 0.72 and 0.70). Precision-recall curve analyses were less favourable. Sociodemographic biases were effectively reduced by a fair pre-processing algorithm without loss of accuracy. CONCLUSIONS: ML approaches using linked routinely collected education and health data offer accurate, low-cost and scalable prediction models of ADHD. These approaches could help identify areas of need and inform resource allocation. Introducing ‘fairness weighting’ attenuates some sociodemographic biases which would otherwise underestimate ADHD risk within minority groups. |
format | Online Article Text |
id | pubmed-9723859 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-9723859 2022-12-07 BMJ Open Mental Health
BMJ Publishing Group 2022-12-05 /pmc/articles/PMC9723859/ /pubmed/36576182 http://dx.doi.org/10.1136/bmjopen-2021-058058 Text en © Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY. Published by BMJ. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/. |
title | Assessing machine learning for fair prediction of ADHD in school pupils using a retrospective cohort study of linked education and healthcare data |
topic | Mental Health |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9723859/ https://www.ncbi.nlm.nih.gov/pubmed/36576182 http://dx.doi.org/10.1136/bmjopen-2021-058058 |
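The abstract in this record reports model comparison by area under the curve (AUC) and bias reduction through "a fair pre-processing algorithm", without naming that algorithm. As a rough illustration only, the sketch below computes AUC from its rank definition and derives per-sample weights using reweighing (Kamiran and Calders), which is an assumption here and not necessarily the authors' method. The function names, group labels and toy data are all hypothetical and are not taken from the study.

```python
from collections import Counter


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def reweighing_weights(groups, labels):
    """Reweighing-style sample weights (assumed technique, not confirmed
    by the source): w(g, y) = P(g) * P(y) / P(g, y).
    After weighting, group membership and outcome are statistically
    independent in the training data."""
    n = len(labels)
    g_count = Counter(groups)
    y_count = Counter(labels)
    gy_count = Counter(zip(groups, labels))
    return [
        g_count[g] * y_count[y] / (n * gy_count[(g, y)])
        for g, y in zip(groups, labels)
    ]


# Toy data, invented purely for illustration.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
groups = ["a", "a", "a", "b", "b", "b"]

auc = roc_auc(labels, scores)
weights = reweighing_weights(groups, labels)
print(round(auc, 3))  # 0.778
print(weights)        # [0.75, 0.75, 1.5, 0.75, 1.5, 0.75]
```

In this toy example group "a" has a raw positive rate of 2/3 and group "b" of 1/3; with the weights applied, both groups have a weighted positive rate of 0.5. Such weights would typically be passed to a classifier as sample weights during training.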