A Pilot Study of Biomedical Text Comprehension using an Attention-Based Deep Neural Reader: Design and Experimental Analysis

BACKGROUND: With the development of artificial intelligence (AI) technology centered on deep-learning, the computer has evolved to a point where it can read a given text and answer a question based on the context of the text. Such a specific task is known as the task of machine comprehension. Existi...

Bibliographic Details
Main Authors: Kim, Seongsoon, Park, Donghyeon, Choi, Yonghwa, Lee, Kyubum, Kim, Byounggun, Jeon, Minji, Kim, Jihye, Tan, Aik Choon, Kang, Jaewoo
Format: Online Article Text
Language: English
Published: JMIR Publications 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5783222/
https://www.ncbi.nlm.nih.gov/pubmed/29305341
http://dx.doi.org/10.2196/medinform.8751
author Kim, Seongsoon
Park, Donghyeon
Choi, Yonghwa
Lee, Kyubum
Kim, Byounggun
Jeon, Minji
Kim, Jihye
Tan, Aik Choon
Kang, Jaewoo
author_sort Kim, Seongsoon
collection PubMed
description BACKGROUND: With the development of artificial intelligence (AI) technology centered on deep learning, the computer has evolved to a point where it can read a given text and answer a question based on the context of the text. This task is known as machine comprehension. Existing machine comprehension tasks mostly use datasets of general texts, such as news articles or elementary school-level storybooks. However, no attempt has been made to determine whether an up-to-date deep learning-based machine comprehension model can also process scientific literature containing expert-level knowledge, especially in the biomedical domain.
OBJECTIVE: This study aims to investigate whether a machine comprehension model can process biomedical articles as well as general texts. Since there is no dataset for the biomedical literature comprehension task, our work includes generating a large-scale question answering dataset using PubMed and manually evaluating the generated dataset.
METHODS: We present an attention-based deep neural model tailored to the biomedical domain. To further enhance the performance of our model, we used a pretrained word vector and biomedical entity type embedding. We also developed an ensemble method of combining the results of several independent models to reduce the variance of the answers from the models.
RESULTS: The experimental results showed that our proposed deep neural network model outperformed the baseline model by more than 7% on the new dataset. We also evaluated human performance on the new dataset. The human evaluation result showed that our deep neural model outperformed humans in comprehension by 22% on average.
CONCLUSIONS: In this work, we introduced a new task of machine comprehension in the biomedical domain using a deep neural model. Since there was no large-scale dataset for training deep neural models in the biomedical domain, we created the new cloze-style datasets Biomedical Knowledge Comprehension Title (BMKC_T) and Biomedical Knowledge Comprehension Last Sentence (BMKC_LS) (together referred to as BioMedical Knowledge Comprehension) using the PubMed corpus. The experimental results showed that the performance of our model is much higher than that of humans. We observed that our model performed consistently better regardless of the degree of difficulty of a text, whereas humans have difficulty when performing biomedical literature comprehension tasks that require expert-level knowledge.
format Online
Article
Text
id pubmed-5783222
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-5783222 2018-01-31 JMIR Med Inform Original Paper JMIR Publications 2018-01-05 /pmc/articles/PMC5783222/ /pubmed/29305341 http://dx.doi.org/10.2196/medinform.8751 Text en ©Seongsoon Kim, Donghyeon Park, Yonghwa Choi, Kyubum Lee, Byounggun Kim, Minji Jeon, Jihye Kim, Aik Choon Tan, Jaewoo Kang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 05.01.2018. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title A Pilot Study of Biomedical Text Comprehension using an Attention-Based Deep Neural Reader: Design and Experimental Analysis
topic Original Paper