IARNN-Based Semantic-Containing Double-Level Embedding Bi-LSTM for Question-and-Answer Matching

Bibliographic Details
Main Authors: Xiong, Chang-zhu; Su, Minglian
Format: Online Article Text
Language: English
Published: Hindawi 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6421739/
https://www.ncbi.nlm.nih.gov/pubmed/30944556
http://dx.doi.org/10.1155/2019/6074840
collection PubMed
description We propose a novel end-to-end approach, the semantic-containing double-level embedding Bi-LSTM model (SCDE-Bi-LSTM), to solve three key problems of Q&A matching in the Chinese medical field. For the similarity calculation in the core Q&A module, we propose a text similarity method that incorporates semantic information, addressing the failure of previous Q&A methods to include the deep information of a sentence in the similarity calculation. For the sentence vector representation module, we present a double-level embedding sentence representation method to reduce the error caused by Chinese medical word segmentation. In addition, because the attention mechanism tends to cause backward deviation of the features, we propose an improved Bi-LSTM-based algorithm for the feature extraction stage. The proposed Q&A framework retains important timing features while discarding low-frequency features and noise, and it is applicable to different domains. To verify the framework, we created extensive Chinese medical Q&A corpora. We ran several state-of-the-art Q&A methods as comparison experiments on the medical corpora and on the popular insuranceQA dataset under different performance measures. The experimental results on the medical corpora show that our framework significantly outperforms several strong baselines, improving top-1 accuracy by up to 14% and reaching 79.15%.
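The abstract reports results as top-1 accuracy: the share of questions for which the highest-scoring candidate answer is a correct one. The following is a minimal sketch of that metric only, not the authors' method. The function names, the toy vectors, and the use of plain cosine similarity as the scorer are all assumptions made for illustration; the paper's own similarity calculation is described as different.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors.

    Assumption: a stand-in scorer, not the paper's similarity method.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top1_accuracy(examples):
    """Fraction of questions whose best-scoring candidate is correct.

    examples: list of (question_vec, candidate_vecs, correct_indices),
    a hypothetical layout chosen for this sketch.
    """
    if not examples:
        return 0.0
    hits = 0
    for question, candidates, correct in examples:
        scores = [cosine(question, c) for c in candidates]
        # Index of the highest-scoring candidate answer.
        best = max(range(len(scores)), key=scores.__getitem__)
        hits += best in correct
    return hits / len(examples)


# Toy data (made up): the first question is matched correctly,
# the second is not, because candidate 1 outscores the correct candidate 2.
toy = [
    ([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]], {0}),
    ([0.0, 1.0], [[1.0, 0.0], [0.2, 0.8], [0.7, 0.7]], {2}),
]

print(top1_accuracy(toy))
```

In an actual evaluation the vectors would come from the trained model's sentence representations, and each question's candidate pool would be drawn from the answer corpus.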
format Online
Article
Text
id pubmed-6421739
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Hindawi
record_format MEDLINE/PubMed
journal Comput Intell Neurosci
date 2019-03-03
copyright Copyright © 2019 Chang-zhu Xiong and Minglian Su.
license http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article