Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports

The development of digital cancer twins relies on the capture of high-resolution representations of individual cancer patients throughout the course of their treatment. Our research aims to improve the detection of metastatic disease over time from structured radiology reports by exposing prediction...

Bibliographic Details
Main Authors: Batch, Karen E., Yue, Jianwei, Darcovich, Alex, Lupton, Kaelan, Liu, Corinne C., Woodlock, David P., El Amine, Mohammad Ali K., Causa-Andrieu, Pamela I., Gazit, Lior, Nguyen, Gary H., Zulkernine, Farhana, Do, Richard K. G., Simpson, Amber L.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924403/
https://www.ncbi.nlm.nih.gov/pubmed/35310959
http://dx.doi.org/10.3389/frai.2022.826402
_version_ 1784669847660003328
author Batch, Karen E.
Yue, Jianwei
Darcovich, Alex
Lupton, Kaelan
Liu, Corinne C.
Woodlock, David P.
El Amine, Mohammad Ali K.
Causa-Andrieu, Pamela I.
Gazit, Lior
Nguyen, Gary H.
Zulkernine, Farhana
Do, Richard K. G.
Simpson, Amber L.
author_sort Batch, Karen E.
collection PubMed
description The development of digital cancer twins relies on the capture of high-resolution representations of individual cancer patients throughout the course of their treatment. Our research aims to improve the detection of metastatic disease over time from structured radiology reports by exposing prediction models to historical information. We demonstrate that natural language processing (NLP) can generate better weak labels for semi-supervised classification of computed tomography (CT) reports when it is exposed to consecutive reports through a patient's treatment history. A total of 714,454 structured radiology reports from Memorial Sloan Kettering Cancer Center, all adhering to a standardized departmental structured template, were used for model development, with a subset of the reports included for validation. To develop the models, a subset of the reports was curated for ground truth: 7,732 total reports in the lung metastases dataset from 867 individual patients; 2,777 reports in the liver metastases dataset from 315 patients; and 4,107 reports in the adrenal metastases dataset from 404 patients. We use NLP to extract and encode important features from the structured text reports, which are then used to develop, train, and validate models. Three models were developed to classify the type of metastatic disease and validated against the ground truth labels: a simple convolutional neural network (CNN), a CNN augmented with an attention layer, and a recurrent neural network (RNN). The models use features from consecutive structured text radiology reports of a patient to predict the presence of metastatic disease in the reports. A single-report model, previously developed to analyze one report instead of multiple past reports, is included, and the results from all four models are compared based on accuracy, precision, recall, and F1-score. The best model is used to label all 714,454 reports to generate metastases maps. Our results suggest that NLP models can extract cancer progression patterns from multiple consecutive reports and predict the presence of metastatic disease in multiple organs with higher performance than a single-report-based prediction. This demonstrates a promising automated approach that labels large numbers of radiology reports in a time- and cost-effective manner without involving human experts, and that enables tracking of cancer progression over time.
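The abstract mentions two mechanics that a short sketch may make more concrete: pairing each report with the consecutive reports that precede it for the same patient, and comparing models on accuracy, precision, recall, and F1-score. The following is a minimal illustration under stated assumptions, not the authors' implementation. The tuple layout `(patient_id, date, text)`, the `max_history` parameter, and both function names are hypothetical and do not come from the article.

```python
from collections import defaultdict


def consecutive_windows(reports, max_history=3):
    """Pair each report with up to `max_history` earlier reports of the same patient.

    `reports` is an iterable of (patient_id, date, text) tuples; this layout is an
    assumption made for illustration. Dates must sort chronologically (e.g. ISO strings).
    """
    by_patient = defaultdict(list)
    for patient_id, date, text in reports:
        by_patient[patient_id].append((date, text))

    windows = []
    for patient_id, items in by_patient.items():
        items.sort(key=lambda item: item[0])  # oldest report first
        texts = [text for _, text in items]
        for i in range(len(texts)):
            start = max(0, i - max_history)
            # The window ends with the report being classified.
            windows.append((patient_id, texts[start:i + 1]))
    return windows


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = metastasis present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

In a setup like this, a single-report baseline would correspond to `max_history=0`, so the same evaluation function could be applied to both the single-report and multi-report predictions.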
format Online
Article
Text
id pubmed-8924403
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8924403 2022-03-17 Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports. Front Artif Intell. Artificial Intelligence. Frontiers Media S.A. 2022-03-02 /pmc/articles/PMC8924403/ /pubmed/35310959 http://dx.doi.org/10.3389/frai.2022.826402 Text en Copyright © 2022 Batch, Yue, Darcovich, Lupton, Liu, Woodlock, El Amine, Causa-Andrieu, Gazit, Nguyen, Zulkernine, Do and Simpson. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports
topic Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924403/
https://www.ncbi.nlm.nih.gov/pubmed/35310959
http://dx.doi.org/10.3389/frai.2022.826402