Early detection of colorectal cancer by leveraging Dutch primary care consultation notes with free text embeddings

We aimed to assess the added predictive performance that free-text Dutch consultation notes provide in detecting colorectal cancer in primary care, in comparison to currently used models. We developed, evaluated and compared three prediction models for colorectal cancer (CRC) in a large primary care...

Bibliographic Details
Main authors: Luik, Torec T.; Abu-Hanna, Ameen; van Weert, Henk C. P. M.; Schut, Martijn C.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10319709/
https://www.ncbi.nlm.nih.gov/pubmed/37402757
http://dx.doi.org/10.1038/s41598-023-37397-2
author Luik, Torec T.
Abu-Hanna, Ameen
van Weert, Henk C. P. M.
Schut, Martijn C.
collection PubMed
description We aimed to assess the added predictive performance that free-text Dutch consultation notes provide in detecting colorectal cancer in primary care, in comparison to currently used models. We developed, evaluated and compared three prediction models for colorectal cancer (CRC) in a large primary care database with 60,641 patients. The prediction model that combines known predictive features with free-text data (TabTxt, AUROC: 0.823) performs statistically significantly better (p < 0.05) than the other two models, which use only tabular data (as used nowadays) and only text data, respectively (AUROC Tab: 0.767; Txt: 0.797). The specificity of the two models that use demographics and known CRC features (Tab: 0.321; TabTxt: 0.335) is higher than that of the model with only free text (Txt: 0.234). The Txt and, to a lesser degree, TabTxt models are well calibrated, while the Tab model shows slight underprediction at both tails. As expected with an outcome prevalence below 0.01, all models show poorly calibrated predictions in the extreme upper tail (top 1%). Free-text consultation notes show promise for improving predictive performance over established prediction models that only use structured features. Future clinical implications for our CRC use case include that such improvement may help lower the number of referrals for suspected CRC to medical specialists.
format Online
Article
Text
id pubmed-10319709
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10319709 2023-07-06 Sci Rep Article
Nature Publishing Group UK 2023-07-04 /pmc/articles/PMC10319709/ /pubmed/37402757 http://dx.doi.org/10.1038/s41598-023-37397-2 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Early detection of colorectal cancer by leveraging Dutch primary care consultation notes with free text embeddings
topic Article