Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports
The development of digital cancer twins relies on the capture of high-resolution representations of individual cancer patients throughout the course of their treatment. Our research aims to improve the detection of metastatic disease over time from structured radiology reports by exposing prediction...
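Main authors: | Batch, Karen E.; Yue, Jianwei; Darcovich, Alex; Lupton, Kaelan; Liu, Corinne C.; Woodlock, David P.; El Amine, Mohammad Ali K.; Causa-Andrieu, Pamela I.; Gazit, Lior; Nguyen, Gary H.; Zulkernine, Farhana; Do, Richard K. G.; Simpson, Amber L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Artificial Intelligence |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924403/ https://www.ncbi.nlm.nih.gov/pubmed/35310959 http://dx.doi.org/10.3389/frai.2022.826402 |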
_version_ | 1784669847660003328 |
---|---|
author | Batch, Karen E. Yue, Jianwei Darcovich, Alex Lupton, Kaelan Liu, Corinne C. Woodlock, David P. El Amine, Mohammad Ali K. Causa-Andrieu, Pamela I. Gazit, Lior Nguyen, Gary H. Zulkernine, Farhana Do, Richard K. G. Simpson, Amber L. |
author_facet | Batch, Karen E. Yue, Jianwei Darcovich, Alex Lupton, Kaelan Liu, Corinne C. Woodlock, David P. El Amine, Mohammad Ali K. Causa-Andrieu, Pamela I. Gazit, Lior Nguyen, Gary H. Zulkernine, Farhana Do, Richard K. G. Simpson, Amber L. |
author_sort | Batch, Karen E. |
collection | PubMed |
description | The development of digital cancer twins relies on the capture of high-resolution representations of individual cancer patients throughout the course of their treatment. Our research aims to improve the detection of metastatic disease over time from structured radiology reports by exposing prediction models to historical information. We demonstrate that natural language processing (NLP) can generate better weak labels for semi-supervised classification of computed tomography (CT) reports when it is exposed to consecutive reports from throughout a patient's treatment history. A total of 714,454 structured radiology reports from Memorial Sloan Kettering Cancer Center, all adhering to a standardized departmental structured template, were used for model development, with a subset of the reports included for validation. To develop the models, a subset of the reports was curated for ground truth: 7,732 total reports in the lung metastases dataset from 867 individual patients; 2,777 reports in the liver metastases dataset from 315 patients; and 4,107 reports in the adrenal metastases dataset from 404 patients. We use NLP to extract and encode important features from the structured text reports, which are then used to develop, train, and validate models. Three models were developed to classify the type of metastatic disease and validated against the ground-truth labels: a simple convolutional neural network (CNN), a CNN augmented with an attention layer, and a recurrent neural network (RNN). The models use features from consecutive structured text radiology reports of a patient to predict the presence of metastatic disease in the reports. A single-report model, previously developed to analyze one report instead of multiple past reports, is included, and the results from all four models are compared based on accuracy, precision, recall, and F1-score. The best model is used to label all 714,454 reports to generate metastases maps. Our results suggest that NLP models can extract cancer progression patterns from multiple consecutive reports and predict the presence of metastatic disease in multiple organs with higher performance than a single-report-based prediction. This demonstrates a promising automated approach to labeling large numbers of radiology reports in a time- and cost-effective manner, without involving human experts, and enables tracking of cancer progression over time. |
format | Online Article Text |
id | pubmed-8924403 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8924403 2022-03-17 Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports Batch, Karen E. Yue, Jianwei Darcovich, Alex Lupton, Kaelan Liu, Corinne C. Woodlock, David P. El Amine, Mohammad Ali K. Causa-Andrieu, Pamela I. Gazit, Lior Nguyen, Gary H. Zulkernine, Farhana Do, Richard K. G. Simpson, Amber L. Front Artif Intell Artificial Intelligence The development of digital cancer twins relies on the capture of high-resolution representations of individual cancer patients throughout the course of their treatment. Our research aims to improve the detection of metastatic disease over time from structured radiology reports by exposing prediction models to historical information. We demonstrate that natural language processing (NLP) can generate better weak labels for semi-supervised classification of computed tomography (CT) reports when it is exposed to consecutive reports from throughout a patient's treatment history. A total of 714,454 structured radiology reports from Memorial Sloan Kettering Cancer Center, all adhering to a standardized departmental structured template, were used for model development, with a subset of the reports included for validation. To develop the models, a subset of the reports was curated for ground truth: 7,732 total reports in the lung metastases dataset from 867 individual patients; 2,777 reports in the liver metastases dataset from 315 patients; and 4,107 reports in the adrenal metastases dataset from 404 patients. We use NLP to extract and encode important features from the structured text reports, which are then used to develop, train, and validate models. Three models were developed to classify the type of metastatic disease and validated against the ground-truth labels: a simple convolutional neural network (CNN), a CNN augmented with an attention layer, and a recurrent neural network (RNN). The models use features from consecutive structured text radiology reports of a patient to predict the presence of metastatic disease in the reports. A single-report model, previously developed to analyze one report instead of multiple past reports, is included, and the results from all four models are compared based on accuracy, precision, recall, and F1-score. The best model is used to label all 714,454 reports to generate metastases maps. Our results suggest that NLP models can extract cancer progression patterns from multiple consecutive reports and predict the presence of metastatic disease in multiple organs with higher performance than a single-report-based prediction. This demonstrates a promising automated approach to labeling large numbers of radiology reports in a time- and cost-effective manner, without involving human experts, and enables tracking of cancer progression over time. Frontiers Media S.A. 2022-03-02 /pmc/articles/PMC8924403/ /pubmed/35310959 http://dx.doi.org/10.3389/frai.2022.826402 Text en Copyright © 2022 Batch, Yue, Darcovich, Lupton, Liu, Woodlock, El Amine, Causa-Andrieu, Gazit, Nguyen, Zulkernine, Do and Simpson. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Artificial Intelligence Batch, Karen E. Yue, Jianwei Darcovich, Alex Lupton, Kaelan Liu, Corinne C. Woodlock, David P. El Amine, Mohammad Ali K. Causa-Andrieu, Pamela I. Gazit, Lior Nguyen, Gary H. Zulkernine, Farhana Do, Richard K. G. Simpson, Amber L. Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports |
title | Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports |
title_full | Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports |
title_fullStr | Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports |
title_full_unstemmed | Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports |
title_short | Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports |
title_sort | developing a cancer digital twin: supervised metastases detection from consecutive structured radiology reports |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924403/ https://www.ncbi.nlm.nih.gov/pubmed/35310959 http://dx.doi.org/10.3389/frai.2022.826402 |