Cargando…
MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence
Accessing and utilizing geospatial data from various sources is essential for developing scientific research to address complex scientific and societal challenges that require interdisciplinary knowledge. The traditional keyword-based geosearch approach is insufficient due to the uncertainty inheren...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10642800/ https://www.ncbi.nlm.nih.gov/pubmed/37956160 http://dx.doi.org/10.1371/journal.pone.0293034 |
_version_ | 1785147026886885376 |
---|---|
author | Li, Diya Zhang, Zhe |
author_facet | Li, Diya Zhang, Zhe |
author_sort | Li, Diya |
collection | PubMed |
description | Accessing and utilizing geospatial data from various sources is essential for developing scientific research to address complex scientific and societal challenges that require interdisciplinary knowledge. The traditional keyword-based geosearch approach is insufficient due to the uncertainty inherent within spatial information and how it is presented in the data-sharing platform. For instance, the Gulf of Mexico Coastal Ocean Observing System (GCOOS) data search platform stores geoinformation and metadata in a complex tabular. Users can search for data by entering keywords or selecting data from a drop-down manual from the user interface. However, the search results provide limited information about the data product, where detailed descriptions, potential use, and relationship with other data products are still missing. Language models (LMs) have demonstrated great potential in tasks like question answering, sentiment analysis, text classification, and machine translation. However, they struggle when dealing with metadata represented in tabular format. To overcome these challenges, we developed Meta Question Answering System (MetaQA), a novel spatial data search model. MetaQA integrates end-to-end AI models with a generative pre-trained transformer (GPT) to enhance geosearch services. Using GCOOS metadata as a case study, we tested the effectiveness of MetaQA. The results revealed that MetaQA outperforms state-of-the-art question-answering models in handling tabular metadata, underlining its potential for user-inspired geosearch services. |
format | Online Article Text |
id | pubmed-10642800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-106428002023-11-14 MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence Li, Diya Zhang, Zhe PLoS One Research Article Accessing and utilizing geospatial data from various sources is essential for developing scientific research to address complex scientific and societal challenges that require interdisciplinary knowledge. The traditional keyword-based geosearch approach is insufficient due to the uncertainty inherent within spatial information and how it is presented in the data-sharing platform. For instance, the Gulf of Mexico Coastal Ocean Observing System (GCOOS) data search platform stores geoinformation and metadata in a complex tabular. Users can search for data by entering keywords or selecting data from a drop-down manual from the user interface. However, the search results provide limited information about the data product, where detailed descriptions, potential use, and relationship with other data products are still missing. Language models (LMs) have demonstrated great potential in tasks like question answering, sentiment analysis, text classification, and machine translation. However, they struggle when dealing with metadata represented in tabular format. To overcome these challenges, we developed Meta Question Answering System (MetaQA), a novel spatial data search model. MetaQA integrates end-to-end AI models with a generative pre-trained transformer (GPT) to enhance geosearch services. Using GCOOS metadata as a case study, we tested the effectiveness of MetaQA. The results revealed that MetaQA outperforms state-of-the-art question-answering models in handling tabular metadata, underlining its potential for user-inspired geosearch services. Public Library of Science 2023-11-13 /pmc/articles/PMC10642800/ /pubmed/37956160 http://dx.doi.org/10.1371/journal.pone.0293034 Text en © 2023 Li, Zhang https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Li, Diya Zhang, Zhe MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence |
title | MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence |
title_full | MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence |
title_fullStr | MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence |
title_full_unstemmed | MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence |
title_short | MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence |
title_sort | metaqa: enhancing human-centered data search using generative pre-trained transformer (gpt) language model and artificial intelligence |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10642800/ https://www.ncbi.nlm.nih.gov/pubmed/37956160 http://dx.doi.org/10.1371/journal.pone.0293034 |
work_keys_str_mv | AT lidiya metaqaenhancinghumancentereddatasearchusinggenerativepretrainedtransformergptlanguagemodelandartificialintelligence AT zhangzhe metaqaenhancinghumancentereddatasearchusinggenerativepretrainedtransformergptlanguagemodelandartificialintelligence |