Investigating Query Expansion and Coreference Resolution in Question Answering on BERT
Main Authors: |
---|---
Format: | Online Article Text
Language: | English
Published: | 2020
Subjects: |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298170/ http://dx.doi.org/10.1007/978-3-030-51310-8_5
Summary: | The Bidirectional Encoder Representations from Transformers (BERT) model produces state-of-the-art results on many question answering (QA) datasets, including the Stanford Question Answering Dataset (SQuAD). This paper presents a query expansion (QE) method that identifies good terms in input questions, extracts synonyms for those terms using a widely used language resource, WordNet, and selects the most relevant synonyms from the extracted list. The paper also introduces a novel QE method that produces many alternative sequences for a given input question using same-language machine translation (MT). Furthermore, we use a coreference resolution (CR) technique to identify anaphors or cataphors in paragraphs and substitute them with their original referents. We found that the QA system with this simple CR technique significantly outperforms the BERT baseline on a QA task. We also found that our best-performing QA system is the one that applies all three preprocessing methods (the two QE methods and the CR method) together with BERT, producing an excellent F1 score (89.8 F1 points) on a QA task. Further, we present a comparative analysis of the performance of the BERT QA models taking a variety of criteria into account, and demonstrate our findings on the answer span prediction task.
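Two of the preprocessing ideas summarized above, WordNet-based synonym expansion of question terms and substituting anaphors with their referents before paragraphs are passed to the reader, can be illustrated with a short sketch. This is a minimal illustration only, assuming NLTK's WordNet interface and a coreference resolver that outputs character-offset mention clusters; the function names (`expand_question`, `substitute_referents`), the part-of-speech filter used to pick "good" terms, and the synonym cap are hypothetical and are not the authors' exact method.

```python
# Minimal sketch (not the paper's implementation) of two preprocessing steps
# described in the abstract: WordNet query expansion and referent substitution.
# Requires `nltk` with the 'punkt', 'averaged_perceptron_tagger', and 'wordnet'
# data packages downloaded.
from nltk import pos_tag, word_tokenize
from nltk.corpus import wordnet as wn


def expand_question(question: str, max_synonyms_per_term: int = 2) -> str:
    """Append WordNet synonyms of content words to the question.

    The noun/verb filter stands in for the paper's 'good term' selection;
    it is an assumption here, not the authors' actual criterion.
    """
    expansions = []
    for token, tag in pos_tag(word_tokenize(question)):
        if not tag.startswith(("NN", "VB")):
            continue
        synonyms = {
            lemma.name().replace("_", " ")
            for synset in wn.synsets(token)
            for lemma in synset.lemmas()
            if lemma.name().lower() != token.lower()
        }
        expansions.extend(sorted(synonyms)[:max_synonyms_per_term])
    return f"{question} {' '.join(expansions)}" if expansions else question


def substitute_referents(text: str, clusters: list[list[tuple[int, int]]]) -> str:
    """Replace later mentions in each cluster with the first (referent) mention.

    `clusters` holds character-offset spans, e.g. as produced by an
    off-the-shelf coreference resolver (assumed input format).
    """
    replacements = []
    for cluster in clusters:
        start, end = cluster[0]
        referent = text[start:end]
        replacements.extend((s, e, referent) for s, e in cluster[1:])
    # Apply right-to-left so earlier offsets stay valid after each substitution.
    for s, e, referent in sorted(replacements, reverse=True):
        text = text[:s] + referent + text[e:]
    return text


if __name__ == "__main__":
    print(expand_question("Where was the treaty signed?"))
    paragraph = "Marie Curie won the Nobel Prize. She shared it with Pierre."
    print(substitute_referents(paragraph, [[(0, 11), (33, 36)]]))  # "She" -> "Marie Curie"
```

In the pipeline the abstract describes, such steps would be applied to SQuAD questions and paragraphs before they reach the BERT reader; the same-language MT expansion is not sketched here because it depends on a trained translation model.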