Finding undiagnosed patients with hepatitis C virus: an application of machine learning to US ambulatory electronic medical records
AIMS: To develop and validate a machine learning (ML) algorithm to identify undiagnosed hepatitis C virus (HCV) patients, in order to facilitate prioritisation of patients for targeted HCV screening. METHODS: This retrospective study used ambulatory electronic medical records (EMR) from January 2015...
Main authors: | Rigg, John; Doyle, Orla; McDonogh, Niamh; Leavitt, Nadea; Ali, Rehan; Son, Annie; Kreter, Bruce |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2023 |
Subjects: | Original Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9843171/ https://www.ncbi.nlm.nih.gov/pubmed/36639190 http://dx.doi.org/10.1136/bmjhci-2022-100651 |
_version_ | 1784870326027419648 |
---|---|
author | Rigg, John; Doyle, Orla; McDonogh, Niamh; Leavitt, Nadea; Ali, Rehan; Son, Annie; Kreter, Bruce |
author_sort | Rigg, John |
collection | PubMed |
description | AIMS: To develop and validate a machine learning (ML) algorithm to identify undiagnosed hepatitis C virus (HCV) patients, in order to facilitate prioritisation of patients for targeted HCV screening. METHODS: This retrospective study used ambulatory electronic medical records (EMR) from January 2015 to February 2020. A Gradient Boosting Trees algorithm was trained using patient records to predict initial HCV diagnosis and was validated on a temporally independent held-out cross-section of the data. The fold improvement in precision (the proportion of patients identified by the algorithm who are HCV positive) over universal screening was examined and compared with risk-based screening. RESULTS: 21 508 positive (HCV diagnosed) and 28.2M unlabelled (lacking evidence of HCV diagnosis) patients met the inclusion criteria for the study. After down-sampling unlabelled patients to aid the algorithm's learning process, 16.2M unlabelled patients entered the analysis. Performance of the algorithm was compared with universal screening on the held-out cross-section, which had an incidence of HCV diagnoses of 0.02%. The algorithm achieved 101.0-fold, 18.0-fold and 5.1-fold improvements in precision over universal screening at 5%, 20% and 50% levels of recall, respectively. When compared with risk-based screening, the algorithm required fewer patients to be screened and improved precision. CONCLUSIONS: This study presents strong evidence for the use of ML on EMR data to prioritise patients for targeted HCV testing, with potential to improve efficiency of resource utilisation, thereby reducing the workload for clinicians and saving healthcare costs. A prospective interventional study would allow for further validation before use in a clinical setting. |
format | Online Article Text |
id | pubmed-9843171 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-9843171 2023-01-18 BMJ Health Care Inform Original Research BMJ Publishing Group 2023-01-13 /pmc/articles/PMC9843171/ /pubmed/36639190 http://dx.doi.org/10.1136/bmjhci-2022-100651 Text en © Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/ |
title | Finding undiagnosed patients with hepatitis C virus: an application of machine learning to US ambulatory electronic medical records |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9843171/ https://www.ncbi.nlm.nih.gov/pubmed/36639190 http://dx.doi.org/10.1136/bmjhci-2022-100651 |
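
The abstract reports results as a fold improvement in precision over universal screening at fixed levels of recall. The sketch below illustrates how such a figure can be computed from ranked risk scores. It is not the authors' code: the function names, the toy scores and the toy labels are all hypothetical, and it assumes that the precision of universal screening equals the share of positives in the evaluated population. The last lines apply the same ratio to the abstract's rounded figures (0.02% incidence; 101.0, 18.0 and 5.1 fold), so the implied precisions are approximate.

```python
from typing import Sequence


def precision_at_recall(scores: Sequence[float],
                        labels: Sequence[int],
                        target_recall: float) -> float:
    """Precision among the top-ranked patients needed to reach target_recall."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    for screened, (_, label) in enumerate(ranked, start=1):
        found += label
        if found / total_pos >= target_recall:
            return found / screened
    return found / len(ranked)


def fold_improvement(precision: float, prevalence: float) -> float:
    """Ratio of model precision to universal-screening precision.

    Assumption: screening everyone yields a precision equal to the prevalence.
    """
    return precision / prevalence


# Hypothetical toy data: 10 patients, 2 of them positive.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_prevalence = sum(toy_labels) / len(toy_labels)

toy_precision = precision_at_recall(toy_scores, toy_labels, 0.5)
toy_fold = fold_improvement(toy_precision, toy_prevalence)

# Precision implied by the abstract's rounded figures (approximate).
reported_incidence = 0.0002  # 0.02% in the held-out cross-section
reported_folds = {0.05: 101.0, 0.20: 18.0, 0.50: 5.1}
implied_precision = {recall: fold * reported_incidence
                     for recall, fold in reported_folds.items()}
```

Under these assumptions, a 101.0-fold improvement at 5% recall corresponds to roughly 2 HCV-positive patients per 100 flagged, compared with about 2 per 10 000 under universal screening.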