Investigating Query Expansion and Coreference Resolution in Question Answering on BERT
The Bidirectional Encoder Representations from Transformers (BERT) model produces state-of-the-art results in many question answering (QA) datasets, including the Stanford Question Answering Dataset (SQuAD). This paper presents a query expansion (QE) method that identifies good terms from input ques...
| Main Authors: | Bhattacharjee, Santanu; Haque, Rejwanul; de Buy Wenniger, Gideon Maillette; Way, Andy |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | 2020 |
| Subjects: | Article |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298170/ http://dx.doi.org/10.1007/978-3-030-51310-8_5 |
_version_ | 1783547161229656064 |
---|---|
author | Bhattacharjee, Santanu Haque, Rejwanul de Buy Wenniger, Gideon Maillette Way, Andy |
author_sort | Bhattacharjee, Santanu |
collection | PubMed |
description | The Bidirectional Encoder Representations from Transformers (BERT) model produces state-of-the-art results in many question answering (QA) datasets, including the Stanford Question Answering Dataset (SQuAD). This paper presents a query expansion (QE) method that identifies good terms from input questions, extracts synonyms for the good terms using a widely-used language resource, WordNet, and selects the most relevant synonyms from the list of extracted synonyms. The paper also introduces a novel QE method that produces many alternative sequences for a given input question using same-language machine translation (MT). Furthermore, we use a coreference resolution (CR) technique to identify anaphors or cataphors in paragraphs and substitute them with the original referents. We found that the QA system with this simple CR technique significantly outperforms the BERT baseline in a QA task. We also found that our best-performing QA system is the one that applies these three preprocessing methods (two QE and CR methods) together to BERT, which produces an excellent [Formula: see text] score (89.8 [Formula: see text] points) in a QA task. Further, we present a comparative analysis on the performances of the BERT QA models taking a variety of criteria into account, and demonstrate our findings in the answer span prediction task. |
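The QE step summarized in the description (select good terms from the question, gather synonyms from WordNet, keep the most relevant ones, and emit alternative question sequences) can be sketched roughly as follows. The stopword list and synonym lexicon below are toy stand-ins for the paper's WordNet-based pipeline and term-selection criteria, not its actual implementation.

```python
# Rough sketch of WordNet-style query expansion (QE): keep "good"
# (content) terms, look up synonyms, and emit question variants.
# STOPWORDS and SYNONYMS are illustrative stand-ins, not the paper's data.

STOPWORDS = {"what", "is", "the", "a", "an", "of", "in", "who", "did", "to"}

# Toy stand-in for a WordNet synonym lookup (e.g. NLTK's wordnet.synsets).
SYNONYMS = {
    "author": ["writer"],
    "buy": ["purchase", "acquire"],
}

def good_terms(question: str) -> list[str]:
    """Content words only: a crude proxy for the paper's good-term selection."""
    return [w for w in question.lower().rstrip("?").split() if w not in STOPWORDS]

def expand_query(question: str, max_syns: int = 2) -> list[str]:
    """Original question plus variants with one good term swapped for a synonym."""
    variants = [question]
    tokens = question.lower().rstrip("?").split()
    for i, tok in enumerate(tokens):
        if tok in STOPWORDS:
            continue
        for syn in SYNONYMS.get(tok, [])[:max_syns]:
            variants.append(" ".join(tokens[:i] + [syn] + tokens[i + 1:]))
    return variants
```

Each expanded variant would then be scored by the BERT QA model alongside the original question; the paper's second QE method generates such variants with same-language MT instead of a synonym lexicon.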
format | Online Article Text |
id | pubmed-7298170 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7298170 2020-06-17 Investigating Query Expansion and Coreference Resolution in Question Answering on BERT. Bhattacharjee, Santanu; Haque, Rejwanul; de Buy Wenniger, Gideon Maillette; Way, Andy. Natural Language Processing and Information Systems (Article). Published 2020-05-26. /pmc/articles/PMC7298170/ http://dx.doi.org/10.1007/978-3-030-51310-8_5 Text en © Springer Nature Switzerland AG 2020. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | Investigating Query Expansion and Coreference Resolution in Question Answering on BERT |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298170/ http://dx.doi.org/10.1007/978-3-030-51310-8_5 |
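The CR preprocessing the description mentions (identify anaphors or cataphors in a paragraph and substitute them with their referents before the paragraph reaches BERT) can be illustrated with a minimal sketch. The hand-written mention clusters below stand in for the output of a real coreference-resolution system; the paper's own CR component and its handling of mention spans may differ.

```python
# Minimal sketch of coreference-resolution (CR) preprocessing: replace
# each pronoun mention with its referent string. The clusters are supplied
# by hand here; in practice a CR system would produce them.

def resolve_coreferences(paragraph: str, clusters: list[tuple[str, list[str]]]) -> str:
    """Replace listed mentions with their referents, keeping punctuation.

    `clusters` pairs a referent with its coreferring surface mentions,
    e.g. ("Marie Curie", ["She"]).
    """
    mention_to_ref = {m: ref for ref, mentions in clusters for m in mentions}
    out = []
    for tok in paragraph.split():
        core = tok.strip(".,!?")
        if core in mention_to_ref:
            out.append(tok.replace(core, mention_to_ref[core]))
        else:
            out.append(tok)
    return " ".join(out)

paragraph = "Marie Curie won the prize. She shared it with Pierre."
clusters = [("Marie Curie", ["She"]), ("the prize", ["it"])]
resolved = resolve_coreferences(paragraph, clusters)
# "Marie Curie won the prize. Marie Curie shared the prize with Pierre."
```

After substitution, answer spans that were hidden behind pronouns become directly matchable by the QA model, which is consistent with the reported gain over the BERT baseline.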