A Topic Clustering Approach to Finding Similar Questions from Large Question and Answer Archives

Bibliographic Details
Main Authors: Zhang, Wei-Nan, Liu, Ting, Yang, Yang, Cao, Liujuan, Zhang, Yu, Ji, Rongrong
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3942313/
https://www.ncbi.nlm.nih.gov/pubmed/24595052
http://dx.doi.org/10.1371/journal.pone.0071511
_version_ 1782479046207078400
author Zhang, Wei-Nan
Liu, Ting
Yang, Yang
Cao, Liujuan
Zhang, Yu
Ji, Rongrong
author_sort Zhang, Wei-Nan
collection PubMed
description With the rise of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com) have emerged as alternatives for knowledge and information acquisition. Over time, a large number of high-quality question and answer (Q&A) pairs contributed by human users have accumulated into a comprehensive knowledge base. Unlike search engines, which return long lists of results, CQA services can obtain the correct answers to question queries by automatically finding similar questions that other users have already answered. This greatly improves the efficiency of online information retrieval. However, given a question query, finding similar and well-answered questions is a non-trivial task. The main challenge is the word mismatch between the question query (query) and the candidate question for retrieval (question). To investigate this problem, in this study we capture the word-level semantic similarity between query and question by introducing a topic modeling approach. We then propose an unsupervised machine-learning approach to finding similar questions in CQA Q&A archives. The experimental results show that the proposed approach significantly outperforms state-of-the-art methods.
format Online
Article
Text
id pubmed-3942313
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3942313 2014-03-06 PLoS One Research Article
Public Library of Science 2014-03-04 /pmc/articles/PMC3942313/ /pubmed/24595052 http://dx.doi.org/10.1371/journal.pone.0071511 Text en © 2014 Zhang et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Topic Clustering Approach to Finding Similar Questions from Large Question and Answer Archives
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3942313/
https://www.ncbi.nlm.nih.gov/pubmed/24595052
http://dx.doi.org/10.1371/journal.pone.0071511