
Sequence tagging for biomedical extractive question answering

MOTIVATION: Current studies in extractive question answering (EQA) have modeled the single-span extraction setting, where a single answer span is a label to predict for a given question-passage pair. This setting is natural for general domain EQA as the majority of the questions in the general domai...


Bibliographic Details
Main authors: Yoon, Wonjin, Jackson, Richard, Lagerberg, Aron, Kang, Jaewoo
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344839/
https://www.ncbi.nlm.nih.gov/pubmed/35713500
http://dx.doi.org/10.1093/bioinformatics/btac397
_version_ 1784761302921510912
author Yoon, Wonjin
Jackson, Richard
Lagerberg, Aron
Kang, Jaewoo
collection PubMed
description MOTIVATION: Current studies in extractive question answering (EQA) have modeled the single-span extraction setting, where a single answer span is the label to predict for a given question-passage pair. This setting is natural for general-domain EQA, as the majority of questions in the general domain can be answered with a single span. Following general-domain EQA models, current biomedical EQA (BioEQA) models use the single-span extraction setting with post-processing steps. RESULTS: In this article, we investigate the question distribution across the general and biomedical domains and find that biomedical questions are more likely to require list-type answers (multiple answers) than factoid-type answers (single answer). This calls for models capable of producing multiple answers for a question. Based on this preliminary study, we propose a sequence tagging approach for BioEQA, which is a multi-span extraction setting. Our approach directly handles questions with a variable number of phrases as their answer and can learn to decide the number of answers for a question from training data. In our experiments on the BioASQ 7b and 8b list-type questions, our approach outperformed the best-performing existing models without requiring post-processing steps. AVAILABILITY AND IMPLEMENTATION: Source code and resources are freely available for download at https://github.com/dmis-lab/SeqTagQA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
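To illustrate how a sequence tagging formulation can yield a variable number of answers, here is a minimal sketch of the decoding step. It assumes a BIO-style tag scheme (B = begins an answer, I = continues it, O = outside any answer); the abstract does not state which tag set the authors use, so the scheme, the function name decode_bio_spans, and the toy passage are illustrative assumptions rather than the published implementation.

```python
from typing import List


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Turn per-token tags into a list of answer phrases.

    Assumed scheme (hypothetical, not taken from the paper):
    "B" opens an answer, "I" extends the open answer, "O" is outside.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    answers: List[str] = []
    current: List[str] = []  # tokens of the answer currently being built

    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new answer starts; flush the previous one if still open.
            if current:
                answers.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open answer: close any open answer.
            if current:
                answers.append(" ".join(current))
            current = []

    if current:
        answers.append(" ".join(current))
    return answers


# Toy passage, made up for illustration only.
tokens = ["The", "panel", "covers", "BRCA1", ",", "BRCA2",
          "and", "tumor", "protein", "p53", "."]
tags = ["O", "O", "O", "B", "O", "B", "O", "B", "I", "I", "O"]
print(decode_bio_spans(tokens, tags))
# ['BRCA1', 'BRCA2', 'tumor protein p53']
```

The number of answers falls out of the predicted tags: a passage tagged with three B markers yields three phrases, and one tagged entirely with O yields none, so no answer-count threshold has to be applied afterwards in this sketch.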
format Online
Article
Text
id pubmed-9344839
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9344839 2022-08-03 Sequence tagging for biomedical extractive question answering Yoon, Wonjin Jackson, Richard Lagerberg, Aron Kang, Jaewoo Bioinformatics Original Papers Oxford University Press 2022-06-17 /pmc/articles/PMC9344839/ /pubmed/35713500 http://dx.doi.org/10.1093/bioinformatics/btac397 Text en © The Author(s) 2022. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Sequence tagging for biomedical extractive question answering
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344839/
https://www.ncbi.nlm.nih.gov/pubmed/35713500
http://dx.doi.org/10.1093/bioinformatics/btac397