
One Clinician Is All You Need–Cardiac Magnetic Resonance Imaging Measurement Extraction: Deep Learning Algorithm Development

BACKGROUND: Cardiac magnetic resonance imaging (CMR) is a powerful diagnostic modality that provides detailed quantitative assessment of cardiac anatomy and function. Automated extraction of CMR measurements from clinical reports that are typically stored as unstructured text in electronic health re...


Bibliographic Details
Main authors: Singh, Pulkit, Haimovich, Julian, Reeder, Christopher, Khurshid, Shaan, Lau, Emily S, Cunningham, Jonathan W, Philippakis, Anthony, Anderson, Christopher D, Ho, Jennifer E, Lubitz, Steven A, Batra, Puneet
Format: Online Article Text
Language: English
Published: JMIR Publications 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9526125/
https://www.ncbi.nlm.nih.gov/pubmed/35960155
http://dx.doi.org/10.2196/38178
author Singh, Pulkit
Haimovich, Julian
Reeder, Christopher
Khurshid, Shaan
Lau, Emily S
Cunningham, Jonathan W
Philippakis, Anthony
Anderson, Christopher D
Ho, Jennifer E
Lubitz, Steven A
Batra, Puneet
author_sort Singh, Pulkit
collection PubMed
description BACKGROUND: Cardiac magnetic resonance imaging (CMR) is a powerful diagnostic modality that provides detailed quantitative assessment of cardiac anatomy and function. Automated extraction of CMR measurements from clinical reports, which are typically stored as unstructured text in electronic health record systems, would facilitate their use in research. Existing machine learning approaches either rely on large quantities of expert annotation or require the development of engineered rules that are time-consuming to build and specific to the setting in which they were developed.
OBJECTIVE: We hypothesize that the use of pretrained transformer-based language models may enable label-efficient numerical extraction from clinical text without the need for heuristics or large quantities of expert annotations. Here, we fine-tuned pretrained transformer-based language models on a small quantity of CMR annotations to extract 21 CMR measurements. We assessed the effect of clinical pretraining to reduce labeling needs and explored alternative representations of numerical inputs to improve performance.
METHODS: Our study sample comprised 99,252 patients who received longitudinal cardiology care in a multi-institutional health care system. There were 12,720 available CMR reports from 9280 patients. We adapted PRAnCER (Platform Enabling Rapid Annotation for Clinical Entity Recognition), an annotation tool for clinical text, to collect annotations from a study clinician on 370 reports. We experimented with 5 different representations of numerical quantities and several model weight initializations. We evaluated extraction performance using macroaveraged F1-scores across the measurements of interest. We applied the best-performing model to extract measurements from the remaining CMR reports in the study sample and evaluated established associations between selected extracted measures and clinical outcomes to demonstrate validity.
RESULTS: All combinations of weight initializations and numerical representations obtained excellent performance on the gold-standard test set, suggesting that transformer models fine-tuned on a small set of annotations can effectively extract numerical quantities. Our results further indicate that custom numerical representations did not appear to have a significant impact on extraction performance. The best-performing model achieved a macroaveraged F1-score of 0.957 across the evaluated CMR measurements (range 0.92 for the lowest-performing measure of left atrial anterior-posterior dimension to 1.0 for the highest-performing measures of left ventricular end systolic volume index and left ventricular end systolic diameter). Application of the best-performing model to the study cohort yielded 136,407 measurements from all available reports in the study sample. We observed expected associations of extracted left ventricular mass index, left ventricular ejection fraction, and right ventricular ejection fraction with clinical outcomes such as atrial fibrillation, heart failure, and mortality.
CONCLUSIONS: This study demonstrated that a domain-agnostic pretrained transformer model is able to effectively extract quantitative clinical measurements from diagnostic reports with a relatively small number of gold-standard annotations. The proposed workflow may serve as a roadmap for other quantitative entity extraction.
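The abstract reports a macroaveraged F1-score across measurement types. As a rough illustration of how such a score can be computed, the sketch below averages per-measurement F1 values with equal weight. The tuple layout `(report_id, measurement, value)`, the exact-match criterion, and the example report IDs and values are all assumptions made for illustration; the abstract does not state how the authors matched predictions to annotations.

```python
def macro_f1(gold, predicted):
    """Return (macro-averaged F1, per-measurement F1 dict).

    gold, predicted: sets of (report_id, measurement, value) tuples.
    A prediction counts as correct only if all three fields match
    exactly (an assumption of this sketch, not the paper's criterion).
    """
    measurements = sorted({m for _, m, _ in gold} | {m for _, m, _ in predicted})
    per_measure = {}
    for name in measurements:
        g = {t for t in gold if t[1] == name}
        p = {t for t in predicted if t[1] == name}
        true_pos = len(g & p)
        precision = true_pos / len(p) if p else 0.0
        recall = true_pos / len(g) if g else 0.0
        denom = precision + recall
        per_measure[name] = 2 * precision * recall / denom if denom else 0.0
    # Macro average: every measurement type counts equally,
    # regardless of how often it appears in the reports.
    macro = sum(per_measure.values()) / len(per_measure) if per_measure else 0.0
    return macro, per_measure


# Hypothetical toy data: two reports, two measurement types.
gold = {("r1", "LVEF", "55"), ("r2", "LVEF", "60"), ("r1", "RVEF", "50")}
predicted = {("r1", "LVEF", "55"), ("r2", "LVEF", "62"), ("r1", "RVEF", "50")}

macro, per_measure = macro_f1(gold, predicted)
print(per_measure)  # per-measurement F1
print(macro)        # unweighted mean over measurement types
```

Because the average is unweighted, a rarely reported measurement with poor extraction quality lowers the score as much as a common one, which is presumably why a macro average suits a task with 21 measurement types of differing frequency.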
format Online
Article
Text
id pubmed-9526125
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-9526125 2022-10-02 JMIR Med Inform Original Paper
JMIR Publications 2022-09-16 /pmc/articles/PMC9526125/ /pubmed/35960155 http://dx.doi.org/10.2196/38178 Text en ©Pulkit Singh, Julian Haimovich, Christopher Reeder, Shaan Khurshid, Emily S Lau, Jonathan W Cunningham, Anthony Philippakis, Christopher D Anderson, Jennifer E Ho, Steven A Lubitz, Puneet Batra. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 16.09.2022. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title One Clinician Is All You Need–Cardiac Magnetic Resonance Imaging Measurement Extraction: Deep Learning Algorithm Development
topic Original Paper