Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
PURPOSE: Diagnostic statements for pituitary adenomas (PAs) are complex and unstandardized. We aimed to determine the most commonly used elements contained in the statements and their combination patterns and variations in real-world clinical practice, with the ultimate goal of promoting standardize...
Main authors: | Zhou, Jingya; Guo, Xiaopeng; Duan, Lian; Yao, Yong; Shang, Yafei; Wang, Yi; Xing, Bing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9727982/ https://www.ncbi.nlm.nih.gov/pubmed/36476365 http://dx.doi.org/10.1186/s12911-022-02031-0 |
_version_ | 1784845147469512704 |
---|---|
author | Zhou, Jingya Guo, Xiaopeng Duan, Lian Yao, Yong Shang, Yafei Wang, Yi Xing, Bing |
author_facet | Zhou, Jingya Guo, Xiaopeng Duan, Lian Yao, Yong Shang, Yafei Wang, Yi Xing, Bing |
author_sort | Zhou, Jingya |
collection | PubMed |
description | PURPOSE: Diagnostic statements for pituitary adenomas (PAs) are complex and unstandardized. We aimed to determine the most commonly used elements contained in the statements and their combination patterns and variations in real-world clinical practice, with the ultimate goal of promoting standardized diagnostic recording and establishing an efficient element extraction process. METHODS: Patient medical records from 2012 to 2020 that included PA among the first three diagnoses were included. After manually labeling the elements in the diagnostic texts, we obtained element types and training sets, according to which an information extraction model was constructed based on the word segmentation model “Jieba” to extract information contained in the remaining diagnostic texts. RESULTS: A total of 576 different diagnostic statements from 4010 texts of 3770 medical records were enrolled in the analysis. The first ten diagnostic elements related to PA were histopathology, tumor location, endocrine status, tumor size, invasiveness, recurrence, diagnostic confirmation, Knosp grade, residual tumor, and refractoriness. The automated extraction model achieved F1-scores that reached 100% for all ten elements in the second round and 97.3–100.0% in the test set consisting of an additional 532 diagnostic texts. Tumor location, endocrine status, histopathology, and tumor size were the most commonly used elements, and diagnoses composed of the above elements were the most frequent. Endocrine status had the greatest expression variability, followed by Knosp grade. Among all the terms, the percentage of loss of tumor size was among the highest (21%). Among statements where the principal diagnoses were PAs, 18.6% did not have information on tumor size, while for those with other diagnoses, this percentage rose to 48% (P < 0.001). CONCLUSION: Standardization of the diagnostic statement for PAs is unsatisfactory in real-world clinical practice. This study could help standardize a structured pattern for PA diagnosis and establish a foundation for research-friendly, high-quality clinical information extraction. |
format | Online Article Text |
id | pubmed-9727982 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9727982 2022-12-08 Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records Zhou, Jingya Guo, Xiaopeng Duan, Lian Yao, Yong Shang, Yafei Wang, Yi Xing, Bing BMC Med Inform Decis Mak Research PURPOSE: Diagnostic statements for pituitary adenomas (PAs) are complex and unstandardized. We aimed to determine the most commonly used elements contained in the statements and their combination patterns and variations in real-world clinical practice, with the ultimate goal of promoting standardized diagnostic recording and establishing an efficient element extraction process. METHODS: Patient medical records from 2012 to 2020 that included PA among the first three diagnoses were included. After manually labeling the elements in the diagnostic texts, we obtained element types and training sets, according to which an information extraction model was constructed based on the word segmentation model “Jieba” to extract information contained in the remaining diagnostic texts. RESULTS: A total of 576 different diagnostic statements from 4010 texts of 3770 medical records were enrolled in the analysis. The first ten diagnostic elements related to PA were histopathology, tumor location, endocrine status, tumor size, invasiveness, recurrence, diagnostic confirmation, Knosp grade, residual tumor, and refractoriness. The automated extraction model achieved F1-scores that reached 100% for all ten elements in the second round and 97.3–100.0% in the test set consisting of an additional 532 diagnostic texts. Tumor location, endocrine status, histopathology, and tumor size were the most commonly used elements, and diagnoses composed of the above elements were the most frequent. Endocrine status had the greatest expression variability, followed by Knosp grade. Among all the terms, the percentage of loss of tumor size was among the highest (21%). Among statements where the principal diagnoses were PAs, 18.6% did not have information on tumor size, while for those with other diagnoses, this percentage rose to 48% (P < 0.001). CONCLUSION: Standardization of the diagnostic statement for PAs is unsatisfactory in real-world clinical practice. This study could help standardize a structured pattern for PA diagnosis and establish a foundation for research-friendly, high-quality clinical information extraction. BioMed Central 2022-12-07 /pmc/articles/PMC9727982/ /pubmed/36476365 http://dx.doi.org/10.1186/s12911-022-02031-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Zhou, Jingya Guo, Xiaopeng Duan, Lian Yao, Yong Shang, Yafei Wang, Yi Xing, Bing Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records |
title | Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records |
title_full | Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records |
title_fullStr | Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records |
title_full_unstemmed | Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records |
title_short | Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records |
title_sort | moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9727982/ https://www.ncbi.nlm.nih.gov/pubmed/36476365 http://dx.doi.org/10.1186/s12911-022-02031-0 |