
HQA-Data: A historical question answer generation dataset from previous multi perspective conversation

This data article contains a question-answer (QA) dataset for training chatbot and chat-analysis models. The dataset targets NLP tasks in which a model must deliver a satisfactory response to a user's query. We obtained data from a well-known dataset, “The Ubuntu Dialogue Co...


Bibliographic Details
Main authors:	Hosen, Sabbir; Eva, Jannatul Ferdous; Hasib, Ayman; Saha, Aloke Kumar; Mridha, M.F.; Wadud, Anwar Hussen
Format:	Online Article Text
Language:	English
Published:	Elsevier 2023
Subjects:	Data Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10294004/ https://www.ncbi.nlm.nih.gov/pubmed/37383776 http://dx.doi.org/10.1016/j.dib.2023.109245

_version_	1785063106419884032
author	Hosen, Sabbir; Eva, Jannatul Ferdous; Hasib, Ayman; Saha, Aloke Kumar; Mridha, M.F.; Wadud, Anwar Hussen
author_facet	Hosen, Sabbir; Eva, Jannatul Ferdous; Hasib, Ayman; Saha, Aloke Kumar; Mridha, M.F.; Wadud, Anwar Hussen
author_sort	Hosen, Sabbir
collection	PubMed
description	This data article contains a question-answer (QA) dataset for training chatbot and chat-analysis models. The dataset targets NLP tasks in which a model must deliver a satisfactory response to a user's query. We built it from the well-known Ubuntu Dialogue Corpus, which consists of about one million multi-turn conversations containing around seven million utterances and one hundred million words. From these lengthy conversations we derived a context for each dialogueID, and from these contexts we generated a number of questions and answers. Every question and answer is contained within its context. The dataset includes 9364 contexts and 36,438 question-answer pairs. In addition to academic research, the dataset may be used for activities such as constructing this QA dataset for another language, deep learning, language interpretation, reading comprehension, and open-domain question answering. We present the data in raw format; it has been open-sourced and is publicly available at https://data.mendeley.com/datasets/p85z3v45xk.
format	Online Article Text
id	pubmed-10294004
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-10294004 2023-06-28 HQA-Data: A historical question answer generation dataset from previous multi perspective conversation. Hosen, Sabbir; Eva, Jannatul Ferdous; Hasib, Ayman; Saha, Aloke Kumar; Mridha, M.F.; Wadud, Anwar Hussen. Data Brief. Data Article. Elsevier 2023-05-18 /pmc/articles/PMC10294004/ /pubmed/37383776 http://dx.doi.org/10.1016/j.dib.2023.109245 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title	HQA-Data: A historical question answer generation dataset from previous multi perspective conversation
topic	Data Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10294004/ https://www.ncbi.nlm.nih.gov/pubmed/37383776 http://dx.doi.org/10.1016/j.dib.2023.109245
