Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation
Cancer staging is an essential clinical attribute informing patient prognosis and clinical trial eligibility. However, it is not routinely recorded in structured electronic health records. Here, we present a generalizable method for the automated classification of TNM stage directly from pathology r...
Main Authors: | Kefeli, Jenna; Tatonetti, Nicholas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327265/ https://www.ncbi.nlm.nih.gov/pubmed/37425701 http://dx.doi.org/10.1101/2023.06.26.23291912 |
_version_ | 1785069587335741440 |
---|---|
author | Kefeli, Jenna Tatonetti, Nicholas |
author_facet | Kefeli, Jenna Tatonetti, Nicholas |
author_sort | Kefeli, Jenna |
collection | PubMed |
description | Cancer staging is an essential clinical attribute informing patient prognosis and clinical trial eligibility. However, it is not routinely recorded in structured electronic health records. Here, we present a generalizable method for the automated classification of TNM stage directly from pathology report text. We train a BERT-based model using publicly available pathology reports across approximately 7,000 patients and 23 cancer types. We explore the use of different model types, with differing input sizes, parameters, and model architectures. Our final model goes beyond term-extraction, inferring TNM stage from context when it is not included in the report text explicitly. As external validation, we test our model on almost 8,000 pathology reports from Columbia University Medical Center, finding that our trained model achieved an AU-ROC of 0.815–0.942. This suggests that our model can be applied broadly to other institutions without additional institution-specific fine-tuning. |
format | Online Article Text |
id | pubmed-10327265 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-103272652023-07-08 Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation Kefeli, Jenna Tatonetti, Nicholas medRxiv Article Cancer staging is an essential clinical attribute informing patient prognosis and clinical trial eligibility. However, it is not routinely recorded in structured electronic health records. Here, we present a generalizable method for the automated classification of TNM stage directly from pathology report text. We train a BERT-based model using publicly available pathology reports across approximately 7,000 patients and 23 cancer types. We explore the use of different model types, with differing input sizes, parameters, and model architectures. Our final model goes beyond term-extraction, inferring TNM stage from context when it is not included in the report text explicitly. As external validation, we test our model on almost 8,000 pathology reports from Columbia University Medical Center, finding that our trained model achieved an AU-ROC of 0.815–0.942. This suggests that our model can be applied broadly to other institutions without additional institution-specific fine-tuning. Cold Spring Harbor Laboratory 2023-06-27 /pmc/articles/PMC10327265/ /pubmed/37425701 http://dx.doi.org/10.1101/2023.06.26.23291912 Text en https://creativecommons.org/licenses/by-nc/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Kefeli, Jenna Tatonetti, Nicholas Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation |
title | Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation |
title_full | Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation |
title_fullStr | Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation |
title_full_unstemmed | Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation |
title_short | Generalizable and Automated Classification of TNM Stage from Pathology Reports with External Validation |
title_sort | generalizable and automated classification of tnm stage from pathology reports with external validation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327265/ https://www.ncbi.nlm.nih.gov/pubmed/37425701 http://dx.doi.org/10.1101/2023.06.26.23291912 |
work_keys_str_mv | AT kefelijenna generalizableandautomatedclassificationoftnmstagefrompathologyreportswithexternalvalidation AT tatonettinicholas generalizableandautomatedclassificationoftnmstagefrompathologyreportswithexternalvalidation |