Bio-AnswerFinder: a system to find answers to questions from biomedical texts
The ever-accelerating pace of biomedical research results in a corresponding acceleration in the volume of biomedical literature created. Since new research builds upon existing knowledge, the rate of increase in the available knowledge encoded in biomedical literature makes easy access to that im...
Main Authors: | Ozyurt, Ibrahim Burak; Bandrowski, Anita; Grethe, Jeffrey S |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7053013/ https://www.ncbi.nlm.nih.gov/pubmed/31925435 http://dx.doi.org/10.1093/database/baz137 |
_version_ | 1783502959107112960 |
---|---|
author | Ozyurt, Ibrahim Burak Bandrowski, Anita Grethe, Jeffrey S |
author_facet | Ozyurt, Ibrahim Burak Bandrowski, Anita Grethe, Jeffrey S |
author_sort | Ozyurt, Ibrahim Burak |
collection | PubMed |
description | The ever-accelerating pace of biomedical research results in a corresponding acceleration in the volume of biomedical literature created. Since new research builds upon existing knowledge, the rate of increase in the available knowledge encoded in biomedical literature makes easy access to that implicit knowledge more vital over time. Toward the goal of making implicit knowledge in the biomedical literature easily accessible to biomedical researchers, we introduce a question answering system called Bio-AnswerFinder. Bio-AnswerFinder uses a weighted-relaxed word mover's distance-based similarity on word/phrase embeddings learned from PubMed abstracts to rank answers after question focus entity type filtering. Our approach retrieves relevant documents iteratively via enhanced keyword queries from a traditional search engine. To improve document retrieval performance, we introduced a supervised long short-term memory neural network to select keywords from the question to facilitate iterative keyword search. Our unsupervised baseline system achieves a mean reciprocal rank (MRR) score of 0.46 and a Precision@1 of 0.32 on 936 questions from BioASQ. The answer sentences are further ranked by a fine-tuned bidirectional encoder representations from transformers (BERT) classifier trained using 100 answer candidate sentences per question for 492 BioASQ questions. To test ranking performance, we report a blind test on 100 questions that three independent annotators scored. These experts preferred BERT-based reranking, with a 7% improvement in MRR and a 13% improvement in Precision@1 on average. |
format | Online Article Text |
id | pubmed-7053013 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7053013 2020-03-09 Bio-AnswerFinder: a system to find answers to questions from biomedical texts Ozyurt, Ibrahim Burak Bandrowski, Anita Grethe, Jeffrey S Database (Oxford) Original Article The ever-accelerating pace of biomedical research results in a corresponding acceleration in the volume of biomedical literature created. Since new research builds upon existing knowledge, the rate of increase in the available knowledge encoded in biomedical literature makes easy access to that implicit knowledge more vital over time. Toward the goal of making implicit knowledge in the biomedical literature easily accessible to biomedical researchers, we introduce a question answering system called Bio-AnswerFinder. Bio-AnswerFinder uses a weighted-relaxed word mover's distance-based similarity on word/phrase embeddings learned from PubMed abstracts to rank answers after question focus entity type filtering. Our approach retrieves relevant documents iteratively via enhanced keyword queries from a traditional search engine. To improve document retrieval performance, we introduced a supervised long short-term memory neural network to select keywords from the question to facilitate iterative keyword search. Our unsupervised baseline system achieves a mean reciprocal rank (MRR) score of 0.46 and a Precision@1 of 0.32 on 936 questions from BioASQ. The answer sentences are further ranked by a fine-tuned bidirectional encoder representations from transformers (BERT) classifier trained using 100 answer candidate sentences per question for 492 BioASQ questions. To test ranking performance, we report a blind test on 100 questions that three independent annotators scored. These experts preferred BERT-based reranking, with a 7% improvement in MRR and a 13% improvement in Precision@1 on average. Oxford University Press 2020-01-10 /pmc/articles/PMC7053013/ /pubmed/31925435 http://dx.doi.org/10.1093/database/baz137 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Ozyurt, Ibrahim Burak Bandrowski, Anita Grethe, Jeffrey S Bio-AnswerFinder: a system to find answers to questions from biomedical texts |
title | Bio-AnswerFinder: a system to find answers to questions from biomedical texts |
title_full | Bio-AnswerFinder: a system to find answers to questions from biomedical texts |
title_fullStr | Bio-AnswerFinder: a system to find answers to questions from biomedical texts |
title_full_unstemmed | Bio-AnswerFinder: a system to find answers to questions from biomedical texts |
title_short | Bio-AnswerFinder: a system to find answers to questions from biomedical texts |
title_sort | bio-answerfinder: a system to find answers to questions from biomedical texts |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7053013/ https://www.ncbi.nlm.nih.gov/pubmed/31925435 http://dx.doi.org/10.1093/database/baz137 |
work_keys_str_mv | AT ozyurtibrahimburak bioanswerfinderasystemtofindanswerstoquestionsfrombiomedicaltexts AT bandrowskianita bioanswerfinderasystemtofindanswerstoquestionsfrombiomedicaltexts AT grethejeffreys bioanswerfinderasystemtofindanswerstoquestionsfrombiomedicaltexts |
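
The description field above reports results as mean reciprocal rank (MRR) and Precision@1. As a reading aid, here is a minimal Python sketch of how these two ranking metrics are conventionally computed. It is not taken from the Bio-AnswerFinder code base: the function names and the toy relevance judgments (`toy_runs`) are hypothetical, and each question is assumed to come with a ranked list of candidates that have already been judged relevant or not.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Return 1/rank of the first relevant candidate, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all questions (0.0 for an empty set)."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(run) for run in runs) / len(runs)


def precision_at_1(runs: Sequence[Sequence[bool]]) -> float:
    """Fraction of questions whose top-ranked candidate is relevant."""
    if not runs:
        return 0.0
    hits = sum(1 for run in runs if len(run) > 0 and run[0])
    return hits / len(runs)


# Hypothetical judgments for four questions (illustration only, not BioASQ data).
# Each inner list is one question's ranked candidates; True marks a correct answer.
toy_runs = [
    [True, False, False],         # correct answer at rank 1 -> RR = 1
    [False, True, False],         # correct answer at rank 2 -> RR = 1/2
    [False, False, False],        # no correct answer        -> RR = 0
    [False, False, False, True],  # correct answer at rank 4 -> RR = 1/4
]

print(f"MRR = {mean_reciprocal_rank(toy_runs):.4f}")
print(f"Precision@1 = {precision_at_1(toy_runs):.4f}")
```

On this toy input the MRR is (1 + 1/2 + 0 + 1/4) / 4 = 0.4375 and Precision@1 is 1/4 = 0.25. Precision@1 only rewards a correct answer in first place, while MRR gives partial credit for correct answers further down the list, which is why the record reports both.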