DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
BACKGROUND: Complete electronic health records (EHRs) are not often available, because information barriers are caused by differences in the level of informatization and the type of the EHR system. Therefore, we aimed to develop a deep learning system [deep learning system for structured recognition...
Main Authors: | Liu, Hao; Wang, Huijin; Bai, Jieyun; Lu, Yaosheng; Long, Shun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AME Publishing Company, 2022 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9358495/ https://www.ncbi.nlm.nih.gov/pubmed/35957704 http://dx.doi.org/10.21037/atm-21-6672 |
_version_ | 1784763942246023168 |
---|---|
author | Liu, Hao; Wang, Huijin; Bai, Jieyun; Lu, Yaosheng; Long, Shun |
author_facet | Liu, Hao; Wang, Huijin; Bai, Jieyun; Lu, Yaosheng; Long, Shun |
author_sort | Liu, Hao |
collection | PubMed |
description | BACKGROUND: Complete electronic health records (EHRs) are not often available, because information barriers are caused by differences in the level of informatization and the type of the EHR system. Therefore, we aimed to develop a deep learning system [deep learning system for structured recognition of text images from unstructured paper-based medical reports (DeepSSR)] for structured recognition of text images from unstructured paper-based medical reports (UPBMRs) to help physicians solve the data-sharing problem. METHODS: UPBMR images were first preprocessed through binarization, image correction, and image segmentation. Next, the table area was detected with a lightweight network (i.e., the proposed YOLOv3-MobileNet model). In addition, the text of the table area was detected and recognized with a model based on differentiable binarization (DB) and a convolutional recurrent neural network (CRNN). Finally, the recognized text was structured according to its row and column coordinates. DeepSSR was trained and validated on our dataset of 4,221 UPBMR images, which were randomly split into training, validation, and testing sets in a ratio of 8:1:1. RESULTS: DeepSSR achieved a high accuracy of 91.10% and a speed of 0.668 s per image. In the system, the proposed YOLOv3-MobileNet model for table detection achieved a precision of 97.8% and a speed of 0.006 s per image. CONCLUSIONS: DeepSSR has high accuracy and fast speed in structured recognition of text based on UPBMR images. This system may help solve the data-sharing problem due to information barriers between hospitals with different EHR systems. |
format | Online Article Text |
id | pubmed-9358495 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | AME Publishing Company |
record_format | MEDLINE/PubMed |
spelling | pubmed-9358495 2022-08-10 DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports Liu, Hao; Wang, Huijin; Bai, Jieyun; Lu, Yaosheng; Long, Shun Ann Transl Med Original Article BACKGROUND: Complete electronic health records (EHRs) are not often available, because information barriers are caused by differences in the level of informatization and the type of the EHR system. Therefore, we aimed to develop a deep learning system [deep learning system for structured recognition of text images from unstructured paper-based medical reports (DeepSSR)] for structured recognition of text images from unstructured paper-based medical reports (UPBMRs) to help physicians solve the data-sharing problem. METHODS: UPBMR images were first preprocessed through binarization, image correction, and image segmentation. Next, the table area was detected with a lightweight network (i.e., the proposed YOLOv3-MobileNet model). In addition, the text of the table area was detected and recognized with a model based on differentiable binarization (DB) and a convolutional recurrent neural network (CRNN). Finally, the recognized text was structured according to its row and column coordinates. DeepSSR was trained and validated on our dataset of 4,221 UPBMR images, which were randomly split into training, validation, and testing sets in a ratio of 8:1:1. RESULTS: DeepSSR achieved a high accuracy of 91.10% and a speed of 0.668 s per image. In the system, the proposed YOLOv3-MobileNet model for table detection achieved a precision of 97.8% and a speed of 0.006 s per image. CONCLUSIONS: DeepSSR has high accuracy and fast speed in structured recognition of text based on UPBMR images. This system may help solve the data-sharing problem due to information barriers between hospitals with different EHR systems. AME Publishing Company 2022-07 /pmc/articles/PMC9358495/ /pubmed/35957704 http://dx.doi.org/10.21037/atm-21-6672 Text en 2022 Annals of Translational Medicine. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/ |
spellingShingle | Original Article; Liu, Hao; Wang, Huijin; Bai, Jieyun; Lu, Yaosheng; Long, Shun; DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports |
title | DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports |
title_full | DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports |
title_fullStr | DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports |
title_full_unstemmed | DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports |
title_short | DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports |
title_sort | deepssr: a deep learning system for structured recognition of text images from unstructured paper-based medical reports |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9358495/ https://www.ncbi.nlm.nih.gov/pubmed/35957704 http://dx.doi.org/10.21037/atm-21-6672 |
work_keys_str_mv | AT liuhao deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports AT wanghuijin deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports AT baijieyun deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports AT luyaosheng deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports AT longshun deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports |
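
The METHODS summary in the description states that the recognized text "was structured according to its row and column coordinates", but this record gives no further detail. The following is a minimal Python sketch of one way such a step could work, not the authors' implementation: it groups recognized text boxes into rows by their vertical centers and orders each row from left to right. The names `TextBox`, `structure_rows`, and `row_tol`, the box format, and the tolerance rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextBox:
    """One recognized text fragment with its bounding box (hypothetical type)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def cy(self) -> float:
        """Vertical center of the box."""
        return self.y + self.h / 2.0


def structure_rows(boxes: List[TextBox], row_tol: float = 0.5) -> List[List[str]]:
    """Group boxes into rows by vertical center, then order each row by x.

    row_tol is an assumed threshold: a box joins the current row when its
    vertical center lies within row_tol * (box height) of that row's mean center.
    """
    rows: List[List[TextBox]] = []
    for box in sorted(boxes, key=lambda b: b.cy):
        if rows:
            current = rows[-1]
            mean_cy = sum(b.cy for b in current) / len(current)
            if abs(box.cy - mean_cy) <= row_tol * box.h:
                current.append(box)
                continue
        rows.append([box])
    # Within each row, the left-to-right order gives the column position.
    return [[b.text for b in sorted(row, key=lambda b: b.x)] for row in rows]


# Illustrative input only; these values are made up, not taken from the dataset.
example = [
    TextBox("WBC", 10, 52, 40, 20),
    TextBox("Item", 10, 10, 40, 20),
    TextBox("Result", 120, 12, 60, 20),
    TextBox("6.1", 120, 50, 30, 20),
]
print(structure_rows(example))  # [['Item', 'Result'], ['WBC', '6.1']]
```

A sketch like this assumes the table image has already been deskewed, which the preprocessing step (image correction) in the description would plausibly provide; skewed rows would need a more robust grouping rule.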