Sequence tagging for biomedical extractive question answering
MOTIVATION: Current studies in extractive question answering (EQA) have modeled the single-span extraction setting, where a single answer span is a label to predict for a given question-passage pair. This setting is natural for general domain EQA as the majority of the questions in the general domai...
Main authors: | Yoon, Wonjin; Jackson, Richard; Lagerberg, Aron; Kang, Jaewoo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344839/ https://www.ncbi.nlm.nih.gov/pubmed/35713500 http://dx.doi.org/10.1093/bioinformatics/btac397 |
_version_ | 1784761302921510912 |
---|---|
author | Yoon, Wonjin Jackson, Richard Lagerberg, Aron Kang, Jaewoo |
author_facet | Yoon, Wonjin Jackson, Richard Lagerberg, Aron Kang, Jaewoo |
author_sort | Yoon, Wonjin |
collection | PubMed |
description | MOTIVATION: Current studies in extractive question answering (EQA) have modeled the single-span extraction setting, where a single answer span is a label to predict for a given question-passage pair. This setting is natural for general domain EQA as the majority of the questions in the general domain can be answered with a single span. Following general domain EQA models, current biomedical EQA (BioEQA) models utilize the single-span extraction setting with post-processing steps. RESULTS: In this article, we investigate the question distribution across the general and biomedical domains and discover biomedical questions are more likely to require list-type answers (multiple answers) than factoid-type answers (single answer). This necessitates the models capable of producing multiple answers for a question. Based on this preliminary study, we propose a sequence tagging approach for BioEQA, which is a multi-span extraction setting. Our approach directly tackles questions with a variable number of phrases as their answer and can learn to decide the number of answers for a question from training data. Our experimental results on the BioASQ 7b and 8b list-type questions outperformed the best-performing existing models without requiring post-processing steps. AVAILABILITY AND IMPLEMENTATION: Source codes and resources are freely available for download at https://github.com/dmis-lab/SeqTagQA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9344839 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-93448392022-08-03 Sequence tagging for biomedical extractive question answering Yoon, Wonjin Jackson, Richard Lagerberg, Aron Kang, Jaewoo Bioinformatics Original Papers MOTIVATION: Current studies in extractive question answering (EQA) have modeled the single-span extraction setting, where a single answer span is a label to predict for a given question-passage pair. This setting is natural for general domain EQA as the majority of the questions in the general domain can be answered with a single span. Following general domain EQA models, current biomedical EQA (BioEQA) models utilize the single-span extraction setting with post-processing steps. RESULTS: In this article, we investigate the question distribution across the general and biomedical domains and discover biomedical questions are more likely to require list-type answers (multiple answers) than factoid-type answers (single answer). This necessitates the models capable of producing multiple answers for a question. Based on this preliminary study, we propose a sequence tagging approach for BioEQA, which is a multi-span extraction setting. Our approach directly tackles questions with a variable number of phrases as their answer and can learn to decide the number of answers for a question from training data. Our experimental results on the BioASQ 7b and 8b list-type questions outperformed the best-performing existing models without requiring post-processing steps. AVAILABILITY AND IMPLEMENTATION: Source codes and resources are freely available for download at https://github.com/dmis-lab/SeqTagQA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-06-17 /pmc/articles/PMC9344839/ /pubmed/35713500 http://dx.doi.org/10.1093/bioinformatics/btac397 Text en © The Author(s) 2022. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Sequence tagging for biomedical extractive question answering |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344839/ https://www.ncbi.nlm.nih.gov/pubmed/35713500 http://dx.doi.org/10.1093/bioinformatics/btac397 |
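
The description above frames list-type question answering as sequence tagging, where a variable number of answer phrases is read directly from per-token labels. As a rough illustration only, the sketch below decodes multiple answer spans from a tagged passage. The B/I/O tag scheme, the function name `decode_bio_spans`, and the toy passage are assumptions made for this example; the record does not state the tag set or decoding procedure the authors use, and this is not their implementation (see https://github.com/dmis-lab/SeqTagQA for that).

```python
from typing import List


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect every phrase marked by B/I/O tags (assumed tag scheme).

    "B" opens a new answer span, "I" continues the open span,
    and any other tag (e.g. "O") closes it.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; flush the previous one if it is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span: close any open span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Toy passage with three tagged answers (made-up data, not from BioASQ).
tokens = ["The", "panel", "covers", "gene", "A", ",",
          "gene", "B", "and", "gene", "C", "."]
tags = ["O", "O", "O", "B", "I", "O", "B", "I", "O", "B", "I", "O"]
answers = decode_bio_spans(tokens, tags)
print(answers)
```

Because the number of answers equals the number of tagged spans, no fixed answer count or score threshold has to be chosen after prediction, which is the property the description attributes to the multi-span setting.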