Cargando…

MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence

Accessing and utilizing geospatial data from various sources is essential for developing scientific research to address complex scientific and societal challenges that require interdisciplinary knowledge. The traditional keyword-based geosearch approach is insufficient due to the uncertainty inheren...

Descripción completa

Detalles Bibliográficos
Autores principales: Li, Diya, Zhang, Zhe
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10642800/
https://www.ncbi.nlm.nih.gov/pubmed/37956160
http://dx.doi.org/10.1371/journal.pone.0293034
_version_ 1785147026886885376
author Li, Diya
Zhang, Zhe
author_facet Li, Diya
Zhang, Zhe
author_sort Li, Diya
collection PubMed
description Accessing and utilizing geospatial data from various sources is essential for developing scientific research to address complex scientific and societal challenges that require interdisciplinary knowledge. The traditional keyword-based geosearch approach is insufficient due to the uncertainty inherent within spatial information and how it is presented in the data-sharing platform. For instance, the Gulf of Mexico Coastal Ocean Observing System (GCOOS) data search platform stores geoinformation and metadata in a complex tabular. Users can search for data by entering keywords or selecting data from a drop-down manual from the user interface. However, the search results provide limited information about the data product, where detailed descriptions, potential use, and relationship with other data products are still missing. Language models (LMs) have demonstrated great potential in tasks like question answering, sentiment analysis, text classification, and machine translation. However, they struggle when dealing with metadata represented in tabular format. To overcome these challenges, we developed Meta Question Answering System (MetaQA), a novel spatial data search model. MetaQA integrates end-to-end AI models with a generative pre-trained transformer (GPT) to enhance geosearch services. Using GCOOS metadata as a case study, we tested the effectiveness of MetaQA. The results revealed that MetaQA outperforms state-of-the-art question-answering models in handling tabular metadata, underlining its potential for user-inspired geosearch services.
format Online
Article
Text
id pubmed-10642800
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-106428002023-11-14 MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence Li, Diya Zhang, Zhe PLoS One Research Article Accessing and utilizing geospatial data from various sources is essential for developing scientific research to address complex scientific and societal challenges that require interdisciplinary knowledge. The traditional keyword-based geosearch approach is insufficient due to the uncertainty inherent within spatial information and how it is presented in the data-sharing platform. For instance, the Gulf of Mexico Coastal Ocean Observing System (GCOOS) data search platform stores geoinformation and metadata in a complex tabular. Users can search for data by entering keywords or selecting data from a drop-down manual from the user interface. However, the search results provide limited information about the data product, where detailed descriptions, potential use, and relationship with other data products are still missing. Language models (LMs) have demonstrated great potential in tasks like question answering, sentiment analysis, text classification, and machine translation. However, they struggle when dealing with metadata represented in tabular format. To overcome these challenges, we developed Meta Question Answering System (MetaQA), a novel spatial data search model. MetaQA integrates end-to-end AI models with a generative pre-trained transformer (GPT) to enhance geosearch services. Using GCOOS metadata as a case study, we tested the effectiveness of MetaQA. The results revealed that MetaQA outperforms state-of-the-art question-answering models in handling tabular metadata, underlining its potential for user-inspired geosearch services. Public Library of Science 2023-11-13 /pmc/articles/PMC10642800/ /pubmed/37956160 http://dx.doi.org/10.1371/journal.pone.0293034 Text en © 2023 Li, Zhang https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Li, Diya
Zhang, Zhe
MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence
title MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence
title_full MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence
title_fullStr MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence
title_full_unstemmed MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence
title_short MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence
title_sort metaqa: enhancing human-centered data search using generative pre-trained transformer (gpt) language model and artificial intelligence
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10642800/
https://www.ncbi.nlm.nih.gov/pubmed/37956160
http://dx.doi.org/10.1371/journal.pone.0293034
work_keys_str_mv AT lidiya metaqaenhancinghumancentereddatasearchusinggenerativepretrainedtransformergptlanguagemodelandartificialintelligence
AT zhangzhe metaqaenhancinghumancentereddatasearchusinggenerativepretrainedtransformergptlanguagemodelandartificialintelligence