Generalization of finetuned transformer language models to new clinical contexts
OBJECTIVE: We have previously developed a natural language processing pipeline using clinical notes written by epilepsy specialists to extract seizure freedom, seizure frequency text, and date of last seizure text for patients with epilepsy. It is important to understand how our methods generalize to new care contexts.
Main Authors: | Xie, Kevin; Terman, Samuel W; Gallagher, Ryan S; Hill, Chloe E; Davis, Kathryn A; Litt, Brian; Roth, Dan; Ellis, Colin A |
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10432353/ https://www.ncbi.nlm.nih.gov/pubmed/37600072 http://dx.doi.org/10.1093/jamiaopen/ooad070 |
_version_ | 1785091385100075008 |
author | Xie, Kevin Terman, Samuel W Gallagher, Ryan S Hill, Chloe E Davis, Kathryn A Litt, Brian Roth, Dan Ellis, Colin A |
author_facet | Xie, Kevin Terman, Samuel W Gallagher, Ryan S Hill, Chloe E Davis, Kathryn A Litt, Brian Roth, Dan Ellis, Colin A |
author_sort | Xie, Kevin |
collection | PubMed |
description | OBJECTIVE: We have previously developed a natural language processing pipeline using clinical notes written by epilepsy specialists to extract seizure freedom, seizure frequency text, and date of last seizure text for patients with epilepsy. It is important to understand how our methods generalize to new care contexts. MATERIALS AND METHODS: We evaluated our pipeline on unseen notes from nonepilepsy-specialist neurologists and non-neurologists without any additional algorithm training. We tested the pipeline out-of-institution using epilepsy specialist notes from an outside medical center with only minor preprocessing adaptations. We examined reasons for discrepancies in performance in new contexts by measuring physical and semantic similarities between documents. RESULTS: Our ability to classify patient seizure freedom decreased by at least 0.12 agreement when moving from epilepsy specialists to nonspecialists or other institutions. On notes from our institution, textual overlap between the extracted outcomes and the gold standard annotations attained from manual chart review decreased by at least 0.11 F(1) when an answer existed but did not change when no answer existed; here our models generalized on notes from the outside institution, losing at most 0.02 agreement. We analyzed textual differences and found that syntactic and semantic differences in both clinically relevant sentences and surrounding contexts significantly influenced model performance. DISCUSSION AND CONCLUSION: Model generalization performance decreased on notes from nonspecialists; out-of-institution generalization on epilepsy specialist notes required small changes to preprocessing but was especially good for seizure frequency text and date of last seizure text, opening opportunities for multicenter collaborations using these outcomes. |
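The "textual overlap" metric in the abstract is a token-level F(1) between the extracted outcome text and the gold-standard chart-review annotation. As an illustrative sketch (not the authors' exact evaluation code; the whitespace tokenization and lowercasing here are assumptions), it can be computed in the style commonly used for extractive question-answering evaluation:

```python
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between an extracted span and a gold annotation."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # When either side is empty ("no answer"), score 1.0 only if both are empty.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a verbose extraction vs. a terse annotation.
print(token_f1("last seizure two weeks ago", "two weeks ago"))  # 0.75
```

Under this kind of metric, the reported drop of "at least 0.11 F(1) when an answer existed" reflects partial rather than exact overlap with the annotated span, while the no-answer case (both sides empty) is unaffected.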
format | Online Article Text |
id | pubmed-10432353 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-104323532023-08-18 Generalization of finetuned transformer language models to new clinical contexts Xie, Kevin Terman, Samuel W Gallagher, Ryan S Hill, Chloe E Davis, Kathryn A Litt, Brian Roth, Dan Ellis, Colin A JAMIA Open Research and Applications OBJECTIVE: We have previously developed a natural language processing pipeline using clinical notes written by epilepsy specialists to extract seizure freedom, seizure frequency text, and date of last seizure text for patients with epilepsy. It is important to understand how our methods generalize to new care contexts. MATERIALS AND METHODS: We evaluated our pipeline on unseen notes from nonepilepsy-specialist neurologists and non-neurologists without any additional algorithm training. We tested the pipeline out-of-institution using epilepsy specialist notes from an outside medical center with only minor preprocessing adaptations. We examined reasons for discrepancies in performance in new contexts by measuring physical and semantic similarities between documents. RESULTS: Our ability to classify patient seizure freedom decreased by at least 0.12 agreement when moving from epilepsy specialists to nonspecialists or other institutions. On notes from our institution, textual overlap between the extracted outcomes and the gold standard annotations attained from manual chart review decreased by at least 0.11 F(1) when an answer existed but did not change when no answer existed; here our models generalized on notes from the outside institution, losing at most 0.02 agreement. We analyzed textual differences and found that syntactic and semantic differences in both clinically relevant sentences and surrounding contexts significantly influenced model performance. 
DISCUSSION AND CONCLUSION: Model generalization performance decreased on notes from nonspecialists; out-of-institution generalization on epilepsy specialist notes required small changes to preprocessing but was especially good for seizure frequency text and date of last seizure text, opening opportunities for multicenter collaborations using these outcomes. Oxford University Press 2023-08-16 /pmc/articles/PMC10432353/ /pubmed/37600072 http://dx.doi.org/10.1093/jamiaopen/ooad070 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research and Applications Xie, Kevin Terman, Samuel W Gallagher, Ryan S Hill, Chloe E Davis, Kathryn A Litt, Brian Roth, Dan Ellis, Colin A Generalization of finetuned transformer language models to new clinical contexts |
title | Generalization of finetuned transformer language models to new clinical contexts |
title_full | Generalization of finetuned transformer language models to new clinical contexts |
title_fullStr | Generalization of finetuned transformer language models to new clinical contexts |
title_full_unstemmed | Generalization of finetuned transformer language models to new clinical contexts |
title_short | Generalization of finetuned transformer language models to new clinical contexts |
title_sort | generalization of finetuned transformer language models to new clinical contexts |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10432353/ https://www.ncbi.nlm.nih.gov/pubmed/37600072 http://dx.doi.org/10.1093/jamiaopen/ooad070 |
work_keys_str_mv | AT xiekevin generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts AT termansamuelw generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts AT gallagherryans generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts AT hillchloee generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts AT daviskathryna generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts AT littbrian generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts AT rothdan generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts AT elliscolina generalizationoffinetunedtransformerlanguagemodelstonewclinicalcontexts |