Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records

Bibliographic Details
Main Authors: Zhou, Jingya; Guo, Xiaopeng; Duan, Lian; Yao, Yong; Shang, Yafei; Wang, Yi; Xing, Bing
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9727982/
https://www.ncbi.nlm.nih.gov/pubmed/36476365
http://dx.doi.org/10.1186/s12911-022-02031-0
_version_ 1784845147469512704
author Zhou, Jingya
Guo, Xiaopeng
Duan, Lian
Yao, Yong
Shang, Yafei
Wang, Yi
Xing, Bing
author_facet Zhou, Jingya
Guo, Xiaopeng
Duan, Lian
Yao, Yong
Shang, Yafei
Wang, Yi
Xing, Bing
author_sort Zhou, Jingya
collection PubMed
description PURPOSE: Diagnostic statements for pituitary adenomas (PAs) are complex and unstandardized. We aimed to identify the elements most commonly used in these statements, along with their combination patterns and variations in real-world clinical practice, with the ultimate goal of promoting standardized diagnostic recording and establishing an efficient element extraction process. METHODS: We included patient medical records from 2012 to 2020 that listed PA among the first three diagnoses. After manually labeling the elements in the diagnostic texts, we obtained element types and training sets, from which we constructed an information extraction model based on the word segmentation model “Jieba” to extract the information contained in the remaining diagnostic texts. RESULTS: A total of 576 different diagnostic statements from 4010 texts of 3770 medical records were included in the analysis. The first ten diagnostic elements related to PA were histopathology, tumor location, endocrine status, tumor size, invasiveness, recurrence, diagnostic confirmation, Knosp grade, residual tumor, and refractoriness. The automated extraction model achieved F1-scores of 100% for all ten elements in the second round and 97.3–100.0% in a test set of an additional 532 diagnostic texts. Tumor location, endocrine status, histopathology, and tumor size were the most commonly used elements, and diagnoses composed of these elements were the most frequent. Endocrine status had the greatest expression variability, followed by Knosp grade. Among all elements, tumor size had one of the highest rates of missing information (21%). Among statements where the principal diagnoses were PAs, 18.6% did not have information on tumor size, while for those with other diagnoses, this percentage rose to 48% (P < 0.001). CONCLUSION: Standardization of the diagnostic statement for PAs is unsatisfactory in real-world clinical practice. This study could help standardize a structured pattern for PA diagnosis and establish a foundation for research-friendly, high-quality clinical information extraction.
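The extraction and scoring steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the record says only that it was built on the "Jieba" word segmentation model, so the sketch swaps in a plain greedy longest-match dictionary lookup using the Python standard library. The lexicon entries, the English example statement, and the function names `extract_elements` and `f1_score` are all hypothetical and chosen for illustration; only the element type names come from the abstract.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical lexicon (assumption): surface phrase -> element type.
# Element type names follow the abstract; the phrases are invented examples.
LEXICON: Dict[str, str] = {
    "pituitary": "tumor location",
    "macroadenoma": "tumor size",
    "microadenoma": "tumor size",
    "nonfunctioning": "endocrine status",
    "growth hormone-secreting": "endocrine status",
    "invasive": "invasiveness",
    "recurrent": "recurrence",
    "knosp grade 3": "Knosp grade",
}


def extract_elements(text: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Greedy longest-match scan: at each position, take the longest
    lexicon phrase that starts there, else advance one character."""
    lowered = text.lower()
    phrases = sorted(lexicon, key=len, reverse=True)  # longest first
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(lowered):
        for phrase in phrases:
            if lowered.startswith(phrase, i):
                found.append((phrase, lexicon[phrase]))
                i += len(phrase)
                break
        else:
            i += 1  # no phrase starts here; move on
    return found


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """F1 = harmonic mean of precision and recall over two label sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    statement = "Recurrent invasive pituitary macroadenoma, nonfunctioning, Knosp grade 3"
    for phrase, element in extract_elements(statement, LEXICON):
        print(f"{element}: {phrase}")
```

A statement that yields no "tumor size" element under such a lookup would be counted as missing that information, which is the kind of omission the abstract reports. A character-level scan is used because it does not rely on whitespace between words; real matching would also need to handle negations and substrings (for example, "invasive" inside "noninvasive"), which this sketch does not.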
format Online
Article
Text
id pubmed-9727982
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-97279822022-12-08 Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records Zhou, Jingya Guo, Xiaopeng Duan, Lian Yao, Yong Shang, Yafei Wang, Yi Xing, Bing BMC Med Inform Decis Mak Research PURPOSE: Diagnostic statements for pituitary adenomas (PAs) are complex and unstandardized. We aimed to determine the most commonly used elements contained in the statements and their combination patterns and variations in real-world clinical practice, with the ultimate goal of promoting standardized diagnostic recording and establishing an efficient element extraction process. METHODS: Patient medical records from 2012 to 2020 that included PA among the first three diagnoses were included. After manually labeling the elements in the diagnostic texts, we obtained element types and training sets, according to which an information extraction model was constructed based on the word segmentation model “Jieba” to extract information contained in the remaining diagnostic texts. RESULTS: A total of 576 different diagnostic statements from 4010 texts of 3770 medical records were enrolled in the analysis. The first ten diagnostic elements related to PA were histopathology, tumor location, endocrine status, tumor size, invasiveness, recurrence, diagnostic confirmation, Knosp grade, residual tumor, and refractoriness. The automated extraction model achieved F1-scores that reached 100% for all ten elements in the second round and 97.3–100.0% in the test set consisting of an additional 532 diagnostic texts. Tumor location, endocrine status, histopathology, and tumor size were the most commonly used elements, and diagnoses composed of the above elements were the most frequent. Endocrine status had the greatest expression variability, followed by Knosp grade. Among all the terms, the percentage of loss of tumor size was among the highest (21%). 
Among statements where the principal diagnoses were PAs, 18.6% did not have information on tumor size, while for those with other diagnoses, this percentage rose to 48% (P < 0.001). CONCLUSION: Standardization of the diagnostic statement for PAs is unsatisfactory in real-world clinical practice. This study could help standardize a structured pattern for PA diagnosis and establish a foundation for research-friendly, high-quality clinical information extraction. BioMed Central 2022-12-07 /pmc/articles/PMC9727982/ /pubmed/36476365 http://dx.doi.org/10.1186/s12911-022-02031-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Zhou, Jingya
Guo, Xiaopeng
Duan, Lian
Yao, Yong
Shang, Yafei
Wang, Yi
Xing, Bing
Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
title Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
title_full Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
title_fullStr Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
title_full_unstemmed Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
title_short Moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
title_sort moving toward a standardized diagnostic statement of pituitary adenoma using an information extraction model: a real-world study based on electronic medical records
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9727982/
https://www.ncbi.nlm.nih.gov/pubmed/36476365
http://dx.doi.org/10.1186/s12911-022-02031-0