
DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports

Bibliographic Details
Main Authors: Liu, Hao, Wang, Huijin, Bai, Jieyun, Lu, Yaosheng, Long, Shun
Format: Online Article Text
Language: English
Published: AME Publishing Company 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9358495/
https://www.ncbi.nlm.nih.gov/pubmed/35957704
http://dx.doi.org/10.21037/atm-21-6672
_version_ 1784763942246023168
author Liu, Hao
Wang, Huijin
Bai, Jieyun
Lu, Yaosheng
Long, Shun
author_facet Liu, Hao
Wang, Huijin
Bai, Jieyun
Lu, Yaosheng
Long, Shun
author_sort Liu, Hao
collection PubMed
description BACKGROUND: Complete electronic health records (EHRs) are often unavailable because of information barriers caused by differences in the level of informatization and in the type of EHR system. Therefore, we aimed to develop a deep learning system for structured recognition of text images from unstructured paper-based medical reports (UPBMRs), termed DeepSSR, to help physicians solve the data-sharing problem. METHODS: UPBMR images were first preprocessed through binarization, image correction, and image segmentation. Next, the table area was detected with a lightweight network (the proposed YOLOv3-MobileNet model). Then, text in the table area was detected and recognized with a model based on differentiable binarization (DB) and a convolutional recurrent neural network (CRNN). Finally, the recognized text was structured according to its row and column coordinates. DeepSSR was trained and validated on our dataset of 4,221 UPBMR images, which were randomly split into training, validation, and testing sets at a ratio of 8:1:1. RESULTS: DeepSSR achieved an accuracy of 91.10% at a speed of 0.668 s per image. Within the system, the proposed YOLOv3-MobileNet model for table detection achieved a precision of 97.8% at a speed of 0.006 s per image. CONCLUSIONS: DeepSSR offers high accuracy and fast speed in the structured recognition of text from UPBMR images. It may help solve the data-sharing problem caused by information barriers between hospitals with different EHR systems.
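The METHODS text above describes a four-stage pipeline: preprocessing, table detection, text detection/recognition, and structuring of the recognized text by its coordinates. As a purely illustrative aid (not the authors' code), the minimal Python sketch below shows how the first and last stages could look: Otsu binarization with OpenCV for preprocessing, and grouping recognized text boxes into table rows by their row and column coordinates. The TextBox class, the function names, the row tolerance, and the sample values are hypothetical assumptions; the YOLOv3-MobileNet, DB, and CRNN models are assumed to run elsewhere and are represented here only by their outputs.

```python
# Minimal sketch of two DeepSSR stages described in the abstract; requires
# opencv-python and numpy. All names and values are illustrative assumptions.
from dataclasses import dataclass
from typing import List

import cv2
import numpy as np


def binarize(image_bgr: np.ndarray) -> np.ndarray:
    """Preprocessing step: convert a report photo to black and white (Otsu threshold)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary


@dataclass
class TextBox:
    """One recognized snippet: its text plus the centre of its bounding box (pixels)."""
    text: str
    cx: float
    cy: float


def structure_into_rows(boxes: List[TextBox], row_tol: float = 15.0) -> List[List[str]]:
    """Structuring step: group boxes into table rows by y coordinate, order each row by x."""
    rows: List[List[TextBox]] = []
    for box in sorted(boxes, key=lambda b: b.cy):
        if rows and abs(box.cy - rows[-1][-1].cy) <= row_tol:
            rows[-1].append(box)   # close in y: same table row
        else:
            rows.append([box])     # otherwise start a new row
    return [[b.text for b in sorted(row, key=lambda b: b.cx)] for row in rows]


if __name__ == "__main__":
    # Hypothetical recognition output for a two-row lab-report table.
    boxes = [
        TextBox("WBC", 40, 102), TextBox("6.2", 220, 100), TextBox("10^9/L", 400, 101),
        TextBox("HGB", 41, 151), TextBox("138", 221, 150), TextBox("g/L", 401, 152),
    ]
    print(structure_into_rows(boxes))
    # -> [['WBC', '6.2', '10^9/L'], ['HGB', '138', 'g/L']]
```

Sorting by the y coordinate and clustering within a pixel tolerance, then ordering each cluster by x, mirrors the row-and-column structuring that the abstract names as the final stage; the detection models upstream would supply the bounding boxes and text.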
format Online
Article
Text
id pubmed-9358495
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher AME Publishing Company
record_format MEDLINE/PubMed
spelling pubmed-9358495 2022-08-10 DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports Liu, Hao Wang, Huijin Bai, Jieyun Lu, Yaosheng Long, Shun Ann Transl Med Original Article BACKGROUND: Complete electronic health records (EHRs) are often unavailable because of information barriers caused by differences in the level of informatization and in the type of EHR system. Therefore, we aimed to develop a deep learning system for structured recognition of text images from unstructured paper-based medical reports (UPBMRs), termed DeepSSR, to help physicians solve the data-sharing problem. METHODS: UPBMR images were first preprocessed through binarization, image correction, and image segmentation. Next, the table area was detected with a lightweight network (the proposed YOLOv3-MobileNet model). Then, text in the table area was detected and recognized with a model based on differentiable binarization (DB) and a convolutional recurrent neural network (CRNN). Finally, the recognized text was structured according to its row and column coordinates. DeepSSR was trained and validated on our dataset of 4,221 UPBMR images, which were randomly split into training, validation, and testing sets at a ratio of 8:1:1. RESULTS: DeepSSR achieved an accuracy of 91.10% at a speed of 0.668 s per image. Within the system, the proposed YOLOv3-MobileNet model for table detection achieved a precision of 97.8% at a speed of 0.006 s per image. CONCLUSIONS: DeepSSR offers high accuracy and fast speed in the structured recognition of text from UPBMR images. It may help solve the data-sharing problem caused by information barriers between hospitals with different EHR systems. AME Publishing Company 2022-07 /pmc/articles/PMC9358495/ /pubmed/35957704 http://dx.doi.org/10.21037/atm-21-6672 Text en 2022 Annals of Translational Medicine. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Original Article
Liu, Hao
Wang, Huijin
Bai, Jieyun
Lu, Yaosheng
Long, Shun
DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
title DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
title_full DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
title_fullStr DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
title_full_unstemmed DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
title_short DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
title_sort deepssr: a deep learning system for structured recognition of text images from unstructured paper-based medical reports
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9358495/
https://www.ncbi.nlm.nih.gov/pubmed/35957704
http://dx.doi.org/10.21037/atm-21-6672
work_keys_str_mv AT liuhao deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports
AT wanghuijin deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports
AT baijieyun deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports
AT luyaosheng deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports
AT longshun deepssradeeplearningsystemforstructuredrecognitionoftextimagesfromunstructuredpaperbasedmedicalreports