
A framework for ontology-based question answering with application to parasite immunology

BACKGROUND: Large quantities of biomedical data are being produced at a rapid pace for a variety of organisms. With ontologies proliferating, data is increasingly being stored using the RDF data model and queried using RDF-based querying languages. While existing systems facilitate the querying in v...


Bibliographic Details
Main Authors: Asiaee, Amir H., Minning, Todd, Doshi, Prashant, Tarleton, Rick L.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4504081/
https://www.ncbi.nlm.nih.gov/pubmed/26185615
http://dx.doi.org/10.1186/s13326-015-0029-x
author Asiaee, Amir H.
Minning, Todd
Doshi, Prashant
Tarleton, Rick L.
collection PubMed
description BACKGROUND: Large quantities of biomedical data are being produced at a rapid pace for a variety of organisms. With ontologies proliferating, data is increasingly being stored using the RDF data model and queried using RDF-based querying languages. While existing systems facilitate the querying in various ways, the scientist must map the question in his or her mind to the interface used by the systems. The field of natural language processing has long investigated the challenges of designing natural-language-based retrieval systems. Recent efforts seek to bring the ability to pose natural language questions to RDF data querying systems while leveraging the associated ontologies. These systems analyze the input question, extract triples (subject, relationship, object) and, where possible, map them to RDF triples in the data. However, in the biomedical context, relationships between entities are not always explicit in the question, and they are often complex, involving many intermediate concepts.
RESULTS: We present a new framework, OntoNLQA, for querying RDF data annotated using ontologies, which allows posing questions in natural language. OntoNLQA offers five steps in order to answer natural language questions. In comparison to previous systems, OntoNLQA differs in how some of the methods are realized. In particular, it introduces a novel approach for discovering the sophisticated semantic associations that may exist between the key terms of a natural language question, in order to build an intuitive query and retrieve precise answers. We apply this framework to the context of parasite immunology data, leading to a system called AskCuebee that allows parasitologists to pose genomic, proteomic and pathway questions in natural language related to the parasite Trypanosoma cruzi. We separately evaluate the accuracy of each component of OntoNLQA as implemented in AskCuebee and the accuracy of the whole system. AskCuebee answers 68% of the questions in a corpus of 125 questions, and 60% of the questions in a new, previously unseen corpus. If we allow simple corrections by the scientists, this proportion increases to 92%.
CONCLUSIONS: We introduce a novel framework for question answering and apply it to parasite immunology data. Evaluations of translating the questions to RDF triple queries by combining machine learning, lexical similarity matching with ontology classes, properties and instances for specificity, and discovering associations between them demonstrate that the approach performs well and improves on previous systems. Consequently, OntoNLQA offers a viable framework for building question answering systems in other biomedical domains.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-015-0029-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4504081
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4504081 2015-07-17 A framework for ontology-based question answering with application to parasite immunology Asiaee, Amir H. Minning, Todd Doshi, Prashant Tarleton, Rick L. J Biomed Semantics Research BioMed Central 2015-07-17 /pmc/articles/PMC4504081/ /pubmed/26185615 http://dx.doi.org/10.1186/s13326-015-0029-x Text en © Asiaee et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A framework for ontology-based question answering with application to parasite immunology
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4504081/
https://www.ncbi.nlm.nih.gov/pubmed/26185615
http://dx.doi.org/10.1186/s13326-015-0029-x