A Topic Clustering Approach to Finding Similar Questions from Large Question and Answer Archives
With the blooming of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com), etc., have emerged as alternatives for knowledge and information acquisition. Over time, a la...
Main Authors: | Zhang, Wei-Nan; Liu, Ting; Yang, Yang; Cao, Liujuan; Zhang, Yu; Ji, Rongrong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3942313/ https://www.ncbi.nlm.nih.gov/pubmed/24595052 http://dx.doi.org/10.1371/journal.pone.0071511 |
_version_ | 1782479046207078400 |
---|---|
author | Zhang, Wei-Nan Liu, Ting Yang, Yang Cao, Liujuan Zhang, Yu Ji, Rongrong |
author_facet | Zhang, Wei-Nan Liu, Ting Yang, Yang Cao, Liujuan Zhang, Yu Ji, Rongrong |
author_sort | Zhang, Wei-Nan |
collection | PubMed |
description | With the blooming of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com), etc., have emerged as alternatives for knowledge and information acquisition. Over time, a large number of question and answer (Q&A) pairs with high quality devoted by human intelligence have been accumulated as a comprehensive knowledge base. Unlike the search engines, which return long lists of results, searching in the CQA services can obtain the correct answers to the question queries by automatically finding similar questions that have already been answered by other users. Hence, it greatly improves the efficiency of the online information retrieval. However, given a question query, finding the similar and well-answered questions is a non-trivial task. The main challenge is the word mismatch between question query (query) and candidate question for retrieval (question). To investigate this problem, in this study, we capture the word semantic similarity between query and question by introducing the topic modeling approach. We then propose an unsupervised machine-learning approach to finding similar questions on CQA Q&A archives. The experimental results show that our proposed approach significantly outperforms the state-of-the-art methods. |
format | Online Article Text |
id | pubmed-3942313 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-39423132014-03-06 A Topic Clustering Approach to Finding Similar Questions from Large Question and Answer Archives Zhang, Wei-Nan Liu, Ting Yang, Yang Cao, Liujuan Zhang, Yu Ji, Rongrong PLoS One Research Article With the blooming of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com), etc., have emerged as alternatives for knowledge and information acquisition. Over time, a large number of question and answer (Q&A) pairs with high quality devoted by human intelligence have been accumulated as a comprehensive knowledge base. Unlike the search engines, which return long lists of results, searching in the CQA services can obtain the correct answers to the question queries by automatically finding similar questions that have already been answered by other users. Hence, it greatly improves the efficiency of the online information retrieval. However, given a question query, finding the similar and well-answered questions is a non-trivial task. The main challenge is the word mismatch between question query (query) and candidate question for retrieval (question). To investigate this problem, in this study, we capture the word semantic similarity between query and question by introducing the topic modeling approach. We then propose an unsupervised machine-learning approach to finding similar questions on CQA Q&A archives. The experimental results show that our proposed approach significantly outperforms the state-of-the-art methods. 
Public Library of Science 2014-03-04 /pmc/articles/PMC3942313/ /pubmed/24595052 http://dx.doi.org/10.1371/journal.pone.0071511 Text en © 2014 Zhang et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Zhang, Wei-Nan Liu, Ting Yang, Yang Cao, Liujuan Zhang, Yu Ji, Rongrong A Topic Clustering Approach to Finding Similar Questions from Large Question and Answer Archives |
title | A Topic Clustering Approach to Finding Similar Questions from Large Question and Answer Archives |
title_sort | topic clustering approach to finding similar questions from large question and answer archives |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3942313/ https://www.ncbi.nlm.nih.gov/pubmed/24595052 http://dx.doi.org/10.1371/journal.pone.0071511 |