
Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening

OBJECTIVE: Efficiently identifying eligible patients is a crucial first step for a successful clinical trial. The objective of this study was to test whether an approach using electronic health record (EHR) data and an ensemble machine learning algorithm incorporating billing codes and data from cli...


Bibliographic Details
Main Authors: Cai, Tianrun, Cai, Fiona, Dahal, Kumar P., Cremone, Gabrielle, Lam, Ethan, Golnik, Charlotte, Seyok, Thany, Hong, Chuan, Cai, Tianxi, Liao, Katherine P.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8449035/
https://www.ncbi.nlm.nih.gov/pubmed/34296815
http://dx.doi.org/10.1002/acr2.11289
author Cai, Tianrun
Cai, Fiona
Dahal, Kumar P.
Cremone, Gabrielle
Lam, Ethan
Golnik, Charlotte
Seyok, Thany
Hong, Chuan
Cai, Tianxi
Liao, Katherine P.
author_sort Cai, Tianrun
collection PubMed
description OBJECTIVE: Efficiently identifying eligible patients is a crucial first step for a successful clinical trial. The objective of this study was to test whether an approach using electronic health record (EHR) data and an ensemble machine learning algorithm incorporating billing codes and data from clinical notes processed by natural language processing (NLP) can improve the efficiency of eligibility screening. METHODS: We studied patients screened for a clinical trial of rheumatoid arthritis (RA) with one or more International Classification of Diseases (ICD) code for RA and age greater than 35 years, from a tertiary care center and a community hospital. The following three groups of EHR features were considered for the algorithm: 1) structured features, 2) the counts of NLP concepts from notes, 3) health care utilization. All features were linked to dates. We applied random forest and logistic regression with least absolute shrinkage and selection operator penalty against the following two standard approaches: 1) one or more RA ICD code and no ICD codes related to exclusion criteria (Screen(RAICD1+EX)) and 2) two or more RA ICD codes (Screen(RAICD2)). To test the portability, we trained the algorithm at one institution and tested it at the other. RESULTS: In total, 3359 patients at Brigham and Women’s Hospital (BWH) and 642 patients at Faulkner Hospital (FH) were studied, with 461 (13.7%) eligible patients at BWH and 84 (13.4%) at FH. The application of the algorithm reduced ineligible patients from chart review by 40.5% at the tertiary care center and by 57.0% at the community hospital. In contrast, Screen(RAICD2) reduced patients for chart review by 2.7% to 11.3%; Screen(RAICD1+EX) reduced patients for chart review by 63% to 65% but excluded 22% to 27% of eligible patients.
CONCLUSION: The ensemble machine learning algorithm incorporating billing codes and NLP data increased the efficiency of eligibility screening by reducing the number of patients requiring chart review while not excluding eligible patients. Moreover, this approach can be trained at one institution and applied at another for multicenter clinical trials.
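As an illustration of the two rule-based baselines named in the abstract, the sketch below applies the "one or more RA ICD code and no exclusion ICD codes" rule and the "two or more RA ICD codes" rule to a small invented cohort, then reports the share of patients removed from chart review and the share of eligible patients lost. The Patient fields, function names, toy data, and exact metric definitions are assumptions made for illustration and are not taken from the study. The sketch does not reproduce the ensemble algorithm itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Patient:
    # Hypothetical fields; the study's actual data model is not given here.
    ra_icd_count: int        # number of RA ICD codes in the record
    has_exclusion_icd: bool  # any ICD code tied to an exclusion criterion
    eligible: bool           # gold-standard label from chart review


def screen_raicd1_ex(p: Patient) -> bool:
    """Keep patients with >= 1 RA ICD code and no exclusion-related ICD code."""
    return p.ra_icd_count >= 1 and not p.has_exclusion_icd


def screen_raicd2(p: Patient) -> bool:
    """Keep patients with >= 2 RA ICD codes."""
    return p.ra_icd_count >= 2


def evaluate(patients: List[Patient],
             screen: Callable[[Patient], bool]) -> Dict[str, float]:
    """Summarize a screening rule (assumed metric definitions).

    chart_review_reduction: fraction of all patients the rule removes.
    eligible_excluded: fraction of truly eligible patients the rule removes.
    """
    removed = [p for p in patients if not screen(p)]
    eligible = [p for p in patients if p.eligible]
    missed = [p for p in eligible if not screen(p)]
    return {
        "chart_review_reduction": len(removed) / len(patients),
        "eligible_excluded": len(missed) / len(eligible),
    }


# Invented toy cohort, not study data.
patients = [
    Patient(ra_icd_count=1, has_exclusion_icd=False, eligible=False),
    Patient(ra_icd_count=3, has_exclusion_icd=False, eligible=True),
    Patient(ra_icd_count=2, has_exclusion_icd=True, eligible=True),
    Patient(ra_icd_count=1, has_exclusion_icd=True, eligible=False),
    Patient(ra_icd_count=4, has_exclusion_icd=False, eligible=False),
]

for name, rule in [("RAICD1+EX", screen_raicd1_ex), ("RAICD2", screen_raicd2)]:
    print(name, evaluate(patients, rule))
```

In this toy cohort both rules remove the same share of patients, but only the exclusion-code rule drops an eligible patient, which mirrors the kind of trade-off the abstract reports for the two baselines.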
format Online
Article
Text
id pubmed-8449035
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8449035 2021-09-24 Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening Cai, Tianrun Cai, Fiona Dahal, Kumar P. Cremone, Gabrielle Lam, Ethan Golnik, Charlotte Seyok, Thany Hong, Chuan Cai, Tianxi Liao, Katherine P. ACR Open Rheumatol Original Articles John Wiley and Sons Inc. 2021-07-23 /pmc/articles/PMC8449035/ /pubmed/34296815 http://dx.doi.org/10.1002/acr2.11289 Text en © 2021 The Authors. ACR Open Rheumatology published by Wiley Periodicals LLC on behalf of American College of Rheumatology. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening
topic Original Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8449035/
https://www.ncbi.nlm.nih.gov/pubmed/34296815
http://dx.doi.org/10.1002/acr2.11289