
An experimental study on the performance of collaborative filtering based on user reviews for large-scale datasets

Collaborative filtering (CF) approaches generate user recommendations based on user similarities. These similarities are calculated based on the overall (explicit) user ratings. However, in some domains, such ratings may be sparse or unavailable. User reviews can play a significant role in such case...
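As a rough illustration of the similarity-based idea in the summary above, the following minimal sketch predicts a missing rating as a similarity-weighted average of other users' ratings. The toy data, the choice of cosine similarity, and the function names are assumptions made for illustration only; the record does not state which similarity measure or prediction algorithms the article uses.

```python
from math import sqrt

# Toy explicit ratings (hypothetical data): user -> {item: rating on a 1-5 scale}
ratings = {
    "u1": {"a": 5.0, "b": 3.0, "c": 4.0},
    "u2": {"a": 4.0, "b": 2.0, "c": 5.0, "d": 4.0},
    "u3": {"a": 1.0, "b": 5.0, "d": 2.0},
}


def cosine_similarity(r1, r2):
    """Cosine similarity over the items both users rated; 0.0 if none."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = sqrt(sum(r1[i] ** 2 for i in common))
    norm2 = sqrt(sum(r2[i] ** 2 for i in common))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


def predict(user, item, ratings):
    """Similarity-weighted average of the ratings other users gave `item`."""
    num = 0.0
    den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    # None signals that no neighbour with a usable similarity rated the item.
    return num / den if den else None


print(predict("u1", "d", ratings))
```

In this toy data, u1 agrees more closely with u2 than with u3, so the prediction for item "d" leans toward u2's rating. When ratings are sparse, few users share rated items and the neighbourhood shrinks, which is the situation the summary describes.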


Bibliographic Details
Main Authors:	AL-Ghuribi, Sumaia; Mohd Noah, Shahrul Azman; Mohammed, Mawal
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2023
Subjects:	Data Mining and Machine Learning
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495999/ https://www.ncbi.nlm.nih.gov/pubmed/37705634 http://dx.doi.org/10.7717/peerj-cs.1525

_version_	1785105014706929664
author	AL-Ghuribi, Sumaia; Mohd Noah, Shahrul Azman; Mohammed, Mawal
author_facet	AL-Ghuribi, Sumaia; Mohd Noah, Shahrul Azman; Mohammed, Mawal
author_sort	AL-Ghuribi, Sumaia
collection	PubMed
description	Collaborative filtering (CF) approaches generate user recommendations based on user similarities. These similarities are calculated from the overall (explicit) user ratings. However, in some domains, such ratings may be sparse or unavailable. User reviews can play a significant role in such cases, as implicit ratings can be derived from the reviews using sentiment analysis, a natural language processing technique. However, most current studies calculate the implicit ratings by simply aggregating the scores of all sentiment words appearing in reviews, thus ignoring the degree of sentiment and the aspects discussed in user reviews. This study addresses this issue by calculating the implicit rating differently, leveraging the rich information in user reviews by using both sentiment words and aspect-sentiment word pairs to enhance CF performance. It proposes four methods to calculate the implicit ratings on large-scale datasets: the first considers the degree of sentiment words, while the second exploits the aspects by extracting aspect-sentiment word pairs to calculate the implicit ratings. The remaining two methods combine explicit ratings with the implicit ratings generated by the first two methods. The generated ratings are then incorporated into different CF rating prediction algorithms to evaluate their effectiveness in enhancing CF performance. Evaluative experiments of the proposed methods are conducted on two large-scale datasets: Amazon and Yelp. Results of the experiments show that the proposed ratings improved the accuracy of CF rating prediction algorithms and outperformed the explicit ratings in terms of three predictive accuracy metrics.
format	Online Article Text
id	pubmed-10495999
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-10495999 2023-09-13 PeerJ Comput Sci PeerJ Inc. 2023-08-25 /pmc/articles/PMC10495999/ /pubmed/37705634 http://dx.doi.org/10.7717/peerj-cs.1525 Text en ©2023 AL-Ghuribi et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	An experimental study on the performance of collaborative filtering based on user reviews for large-scale datasets
title_sort	experimental study on the performance of collaborative filtering based on user reviews for large-scale datasets
topic	Data Mining and Machine Learning
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495999/ https://www.ncbi.nlm.nih.gov/pubmed/37705634 http://dx.doi.org/10.7717/peerj-cs.1525
work_keys_str_mv	AT alghuribisumaia anexperimentalstudyontheperformanceofcollaborativefilteringbasedonuserreviewsforlargescaledatasets AT mohdnoahshahrulazman anexperimentalstudyontheperformanceofcollaborativefilteringbasedonuserreviewsforlargescaledatasets AT mohammedmawal anexperimentalstudyontheperformanceofcollaborativefilteringbasedonuserreviewsforlargescaledatasets AT alghuribisumaia experimentalstudyontheperformanceofcollaborativefilteringbasedonuserreviewsforlargescaledatasets AT mohdnoahshahrulazman experimentalstudyontheperformanceofcollaborativefilteringbasedonuserreviewsforlargescaledatasets AT mohammedmawal experimentalstudyontheperformanceofcollaborativefilteringbasedonuserreviewsforlargescaledatasets
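
The description field above mentions deriving implicit ratings that take the degree of sentiment words into account and then combining them with explicit ratings. The sketch below shows one way such a pipeline could look. It is not the authors' formulation: the mini-lexicon, the degree modifiers, the mapping onto a 1 to 5 scale, and the blending weight `alpha` are all hypothetical choices made for illustration, and aspect-sentiment pair extraction is not covered.

```python
# Hypothetical mini-lexicon: word -> polarity score in [-1, 1] (assumption).
LEXICON = {"good": 0.5, "great": 0.8, "bad": -0.5, "terrible": -0.9}
# Hypothetical degree modifiers that scale the next sentiment word (assumption).
DEGREE = {"very": 1.5, "slightly": 0.5}


def implicit_rating(review, low=1.0, high=5.0):
    """Map the mean degree-weighted sentiment of a review onto [low, high]."""
    scores = []
    weight = 1.0
    for token in review.lower().split():
        word = token.strip(".,!?")
        if word in DEGREE:
            weight = DEGREE[word]
        elif word in LEXICON:
            # Clamp so that an intensified word stays inside [-1, 1].
            scores.append(max(-1.0, min(1.0, weight * LEXICON[word])))
            weight = 1.0
        else:
            # A modifier only applies to the word immediately after it.
            weight = 1.0
    if not scores:
        return None  # no sentiment evidence found in the review
    mean = sum(scores) / len(scores)
    return low + (mean + 1.0) / 2.0 * (high - low)


def combined_rating(explicit, implicit, alpha=0.5):
    """Linear blend of explicit and implicit ratings; alpha is an assumed weight."""
    if implicit is None:
        return explicit
    return alpha * explicit + (1.0 - alpha) * implicit


print(implicit_rating("The battery is very good."))
print(combined_rating(5.0, implicit_rating("The battery is good.")))
```

The resulting ratings could then replace or supplement the explicit ratings fed to a CF rating prediction algorithm, which is the evaluation setup the description outlines.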
