
Feature selection for helpfulness prediction of online product reviews: An empirical study

Online product reviews underpin nearly all e-shopping activities. The high volume of data, together with the varying quality of online reviews, puts growing pressure on automated approaches for informative content prioritization. Despite a substantial body of literature on review helpfulness prediction, the ra...


Bibliographic Details
Main Authors: Du, Jiahua; Rong, Jia; Michalska, Sandra; Wang, Hua; Zhang, Yanchun
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6927604/
https://www.ncbi.nlm.nih.gov/pubmed/31869404
http://dx.doi.org/10.1371/journal.pone.0226902
author Du, Jiahua
Rong, Jia
Michalska, Sandra
Wang, Hua
Zhang, Yanchun
author_sort Du, Jiahua
collection PubMed
description Online product reviews underpin nearly all e-shopping activities. The high volume of data, together with the varying quality of online reviews, puts growing pressure on automated approaches for informative content prioritization. Despite a substantial body of literature on review helpfulness prediction, the rationale behind specific feature selection is largely under-studied. In addition, existing work tends to concentrate on domain- and/or platform-dependent feature curation and therefore lacks wider generalization. Moreover, result comparability and reproducibility suffer because data and source code are frequently unavailable. This study addresses these gaps through the most comprehensive feature identification, evaluation, and selection. To this end, the 30 most frequently used content-based features are first identified from 149 relevant research papers and grouped into five coherent categories. The features are then selected to perform helpfulness prediction on six domains of the largest publicly available Amazon 5-core dataset. Three scenarios for feature selection are considered: (i) individual features, (ii) features within each category, and (iii) all features. Empirical results demonstrate that semantics plays a dominant role in predicting informative reviews, followed by sentiment and other features. Finally, feature combination patterns and selection guidelines across domains are summarized to enhance customer experience in today’s prevalent e-commerce environment. The computational framework for helpfulness prediction used in the study has been released to facilitate result comparability and reproducibility.
format Online
Article
Text
id pubmed-6927604
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6927604 2020-01-07 Feature selection for helpfulness prediction of online product reviews: An empirical study Du, Jiahua Rong, Jia Michalska, Sandra Wang, Hua Zhang, Yanchun PLoS One Research Article
Public Library of Science 2019-12-23 /pmc/articles/PMC6927604/ /pubmed/31869404 http://dx.doi.org/10.1371/journal.pone.0226902 Text en © 2019 Du et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Feature selection for helpfulness prediction of online product reviews: An empirical study
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6927604/
https://www.ncbi.nlm.nih.gov/pubmed/31869404
http://dx.doi.org/10.1371/journal.pone.0226902