IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching
We propose a novel end-to-end approach, namely, the semantic-containing double-level embedding Bi-LSTM model (SCDE-Bi-LSTM), to solve the three key problems of Q&A matching in the Chinese medical field. In the similarity calculation of the Q&A core module, we propose a text similarity calcul...
Main Authors: | Xiong, Chang-zhu, Su, Minglian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6421739/ https://www.ncbi.nlm.nih.gov/pubmed/30944556 http://dx.doi.org/10.1155/2019/6074840 |
_version_ | 1783404287016042496 |
---|---|
author | Xiong, Chang-zhu Su, Minglian |
author_facet | Xiong, Chang-zhu Su, Minglian |
author_sort | Xiong, Chang-zhu |
collection | PubMed |
description | We propose a novel end-to-end approach, namely, the semantic-containing double-level embedding Bi-LSTM model (SCDE-Bi-LSTM), to solve the three key problems of Q&A matching in the Chinese medical field. In the similarity calculation of the Q&A core module, we propose a text similarity calculation method that contains semantic information, to solve the problem that previous Q&A methods do not incorporate the deep information of a sentence into the similarity calculations. For the sentence vector representation module, we present a double-level embedding sentence representation method to reduce the error caused by Chinese medical word segmentation. In addition, because the attention mechanism tends to cause backward deviation of the features, we propose an improved algorithm based on Bi-LSTM in the feature extraction stage. The Q&A framework proposed in this paper not only retains important timing features but also discards low-frequency features and noise. Additionally, it is applicable to different domains. To verify the framework, extensive Chinese medical Q&A corpora are created. We run several state-of-the-art Q&A methods as contrastive experiments on the medical corpora and the current popular insuranceQA dataset under different performance measures. The experimental results on the medical corpora show that our framework significantly outperforms several strong baselines and achieves an improvement of top-1 accuracy of up to 14%, reaching 79.15%. |
format | Online Article Text |
id | pubmed-6421739 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-64217392019-04-03 IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching Xiong, Chang-zhu Su, Minglian Comput Intell Neurosci Research Article We propose a novel end-to-end approach, namely, the semantic-containing double-level embedding Bi-LSTM model (SCDE-Bi-LSTM), to solve the three key problems of Q&A matching in the Chinese medical field. In the similarity calculation of the Q&A core module, we propose a text similarity calculation method that contains semantic information, to solve the problem that previous Q&A methods do not incorporate the deep information of a sentence into the similarity calculations. For the sentence vector representation module, we present a double-level embedding sentence representation method to reduce the error caused by Chinese medical word segmentation. In addition, due to the problem of the attention mechanism tending to cause backward deviation of the features, we propose an improved algorithm based on Bi-LSTM in the feature extraction stage. The Q&A framework proposed in this paper not only retains important timing features but also loses low-frequency features and noise. Additionally, it is applicable to different domains. To verify the framework, extensive Chinese medical Q&A corpora are created. We run several state-of-the-art Q&A methods as contrastive experiments on the medical corpora and the current popular insuranceQA dataset under different performance measures. The experimental results on the medical corpora show that our framework significantly outperforms several strong baselines and achieves an improvement of top-1 accuracy of up to 14%, reaching 79.15%. Hindawi 2019-03-03 /pmc/articles/PMC6421739/ /pubmed/30944556 http://dx.doi.org/10.1155/2019/6074840 Text en Copyright © 2019 Chang-zhu Xiong and Minglian Su. 
http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Xiong, Chang-zhu Su, Minglian IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching |
title | IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching |
title_full | IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching |
title_fullStr | IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching |
title_full_unstemmed | IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching |
title_short | IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching |
title_sort | iarnn-based semantic-containing double-level embedding bi-lstm for question-and-answer matching |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6421739/ https://www.ncbi.nlm.nih.gov/pubmed/30944556 http://dx.doi.org/10.1155/2019/6074840 |
work_keys_str_mv | AT xiongchangzhu iarnnbasedsemanticcontainingdoublelevelembeddingbilstmforquestionandanswermatching AT suminglian iarnnbasedsemanticcontainingdoublelevelembeddingbilstmforquestionandanswermatching |
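
The description above reports results as top-1 accuracy for Q&A matching. As a rough illustration only, the sketch below shows how that measure is commonly computed: each question is scored against its candidate answers, and a question counts as correct when the highest-scoring candidate is the labelled answer. Plain cosine similarity stands in here for the paper's semantic-containing similarity, which this record does not specify; the function names, the toy vectors, and the data layout are all assumptions made for the example, not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# One evaluation item: (question vector, candidate answer vectors, index of the correct answer).
# This layout is hypothetical and chosen only for the illustration.
Example = Tuple[Vector, List[Vector], int]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top1_accuracy(examples: List[Example]) -> float:
    """Fraction of questions whose best-scoring candidate is the labelled answer."""
    if not examples:
        return 0.0
    correct = 0
    for question, candidates, gold_index in examples:
        scores = [cosine_similarity(question, c) for c in candidates]
        # Index of the highest-scoring candidate; ties go to the earliest one.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == gold_index:
            correct += 1
    return correct / len(examples)


# Toy sentence vectors (assumed); a real system would take them from the encoder.
toy_examples: List[Example] = [
    ([1.0, 0.0], [[1.0, 0.1], [0.0, 1.0]], 0),  # best candidate is the gold answer
    ([0.0, 1.0], [[1.0, 0.0], [0.2, 1.0]], 0),  # best candidate is not the gold answer
]
accuracy = top1_accuracy(toy_examples)
```

With these two toy questions, one is matched correctly and one is not, so `accuracy` is 0.5. In a real evaluation the vectors would be sentence representations produced by the model under test.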