An explainable CNN approach for medical codes prediction from clinical text
BACKGROUND: Clinical notes are unstructured text documents generated by clinicians during patient encounters. They are generally annotated with International Classification of Diseases (ICD) codes, which give formatted information about diagnosis and treatment. ICD codes have shown their potential in ma...
Main authors: | Hu, Shuyuan; Teng, Fei; Huang, Lufei; Yan, Jun; Zhang, Haibo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8596896/ https://www.ncbi.nlm.nih.gov/pubmed/34789241 http://dx.doi.org/10.1186/s12911-021-01615-6 |
_version_ | 1784600492306857984 |
---|---|
author | Hu, Shuyuan Teng, Fei Huang, Lufei Yan, Jun Zhang, Haibo |
author_facet | Hu, Shuyuan Teng, Fei Huang, Lufei Yan, Jun Zhang, Haibo |
author_sort | Hu, Shuyuan |
collection | PubMed |
description | BACKGROUND: Clinical notes are unstructured text documents generated by clinicians during patient encounters. They are generally annotated with International Classification of Diseases (ICD) codes, which give formatted information about diagnosis and treatment. ICD codes have shown their potential in many fields, but manual coding is labor-intensive and error-prone, which has led to research on automatic coding. Two specific challenges of this task are: (1) given an annotated clinical note, the reasons behind specific diagnoses and treatments are implicit; (2) explainability is important for a practical automatic coding method, which should not only explain its prediction output but also have explainable internal mechanics. This study aims to develop an explainable CNN approach to address these two challenges. METHOD: Our key idea is that, for the automatic ICD coding task, the presence of informative snippets in the clinical text that correlate with each code plays an important role in the prediction of codes, and that an informative snippet can be considered a local and low-level feature. We infer that there exists a correspondence between a convolution filter and a local and low-level feature. Based on this inference, we propose the Shallow and Wide Attention convolutional Mechanism (SWAM) to improve the ability of CNN-based models to learn local and low-level features for each label. RESULTS: We evaluate our approach on MIMIC-III, an open-access dataset of ICU medical records. Our approach substantially outperforms previous results on top-50 medical code prediction on the MIMIC-III dataset: the precision of the worst-performing 10% of labels in previous works is increased from 0% to 53% on average. We attribute this improvement to SWAM, in which the wide architecture with an attention mechanism gives the model the ability to learn the unique features of different codes more extensively, and we verify this with an ablation experiment. In addition, we manually analyze the performance imbalance between different codes and draw preliminary conclusions about the characteristics that determine the difficulty of learning specific codes. CONCLUSIONS: Our main contributions can be summarized as follows: (1) We show that local and low-level features, a.k.a. informative snippets, play an important role in the automatic ICD coding task, and that the informative snippets extracted from the clinical text provide explanations for each code. (2) We propose that there exists a correspondence between a convolution filter and a local and low-level feature. A combination of a wide and shallow convolutional layer and an attention layer can help CNN-based models better learn local and low-level features. (3) We improved the precision of the worst-performing 10% of labels from 0% to 53% on average. |
format | Online Article Text |
id | pubmed-8596896 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8596896 2021-11-17 An explainable CNN approach for medical codes prediction from clinical text Hu, Shuyuan Teng, Fei Huang, Lufei Yan, Jun Zhang, Haibo BMC Med Inform Decis Mak Research BioMed Central 2021-11-16 /pmc/articles/PMC8596896/ /pubmed/34789241 http://dx.doi.org/10.1186/s12911-021-01615-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Hu, Shuyuan Teng, Fei Huang, Lufei Yan, Jun Zhang, Haibo An explainable CNN approach for medical codes prediction from clinical text |
title | An explainable CNN approach for medical codes prediction from clinical text |
title_full | An explainable CNN approach for medical codes prediction from clinical text |
title_fullStr | An explainable CNN approach for medical codes prediction from clinical text |
title_full_unstemmed | An explainable CNN approach for medical codes prediction from clinical text |
title_short | An explainable CNN approach for medical codes prediction from clinical text |
title_sort | explainable cnn approach for medical codes prediction from clinical text |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8596896/ https://www.ncbi.nlm.nih.gov/pubmed/34789241 http://dx.doi.org/10.1186/s12911-021-01615-6 |
work_keys_str_mv | AT hushuyuan anexplainablecnnapproachformedicalcodespredictionfromclinicaltext AT tengfei anexplainablecnnapproachformedicalcodespredictionfromclinicaltext AT huanglufei anexplainablecnnapproachformedicalcodespredictionfromclinicaltext AT yanjun anexplainablecnnapproachformedicalcodespredictionfromclinicaltext AT zhanghaibo anexplainablecnnapproachformedicalcodespredictionfromclinicaltext AT hushuyuan explainablecnnapproachformedicalcodespredictionfromclinicaltext AT tengfei explainablecnnapproachformedicalcodespredictionfromclinicaltext AT huanglufei explainablecnnapproachformedicalcodespredictionfromclinicaltext AT yanjun explainablecnnapproachformedicalcodespredictionfromclinicaltext AT zhanghaibo explainablecnnapproachformedicalcodespredictionfromclinicaltext |
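
The description field summarizes a single wide convolutional layer followed by an attention layer that learns features for each label. The following is a minimal pure-Python sketch of that general idea, not the authors' implementation. It assumes that "wide" means many filters in one layer, that "shallow" means a single layer, and that attention is computed per label as a softmax over window positions. All names (`conv1d_relu`, `label_attention`), sizes, and random weights are hypothetical and chosen only for illustration.

```python
import math
import random


def conv1d_relu(embeddings, filters, width):
    """Single (shallow) convolution layer with ReLU over token embeddings.

    Each filter sees `width` consecutive tokens, i.e. one local text snippet.
    Returns a list of positions, each holding one activation per filter.
    """
    n = len(embeddings)
    features = []
    for start in range(n - width + 1):
        # Flatten the window of `width` token vectors into one vector.
        window = [x for token in embeddings[start:start + width] for x in token]
        features.append(
            [max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters]
        )
    return features


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(features, query, weights, bias):
    """Assumed per-label attention: weight positions, pool, then score the label."""
    scores = [sum(q * h for q, h in zip(query, row)) for row in features]
    alpha = softmax(scores)  # one attention weight per window position
    pooled = [
        sum(a * row[k] for a, row in zip(alpha, features))
        for k in range(len(features[0]))
    ]
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, alpha


rng = random.Random(0)


def rand_vec(size):
    """Random stand-in for learned parameters (illustration only)."""
    return [rng.uniform(-0.5, 0.5) for _ in range(size)]


# Hypothetical toy sizes; a "wide" layer would use far more filters in practice.
n_tokens, emb_dim, width, n_filters, n_labels = 12, 4, 3, 16, 5

embeddings = [rand_vec(emb_dim) for _ in range(n_tokens)]
filters = [rand_vec(width * emb_dim) for _ in range(n_filters)]
features = conv1d_relu(embeddings, filters, width)

results = [
    label_attention(features, rand_vec(n_filters), rand_vec(n_filters), 0.0)
    for _ in range(n_labels)
]

for label_id, (prob, alpha) in enumerate(results):
    # The most attended window plays the role of an "informative snippet".
    best = max(range(len(alpha)), key=alpha.__getitem__)
    print(f"label {label_id}: p={prob:.3f}, top window starts at token {best}")
```

In this sketch, the attention weights `alpha` indicate which text window contributed most to each label's score, which is one plausible way the extracted snippets could serve as explanations for a predicted code.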