
TextRank Keyword Extraction Algorithm Using Word Vector Clustering Based on Rough Data-Deduction

When the graph-based TextRank algorithm constructs the associative edges of its graph, the co-occurrence window rule considers only relationships between local terms, so the algorithm makes limited use of the information in the document itself. To address this problem, an improved TextRank keyword extraction...


Bibliographic Details
Main Authors:	Zhou, Ning; Shi, Wenqian; Liang, Renyu; Zhong, Na
Format:	Online Article Text
Language:	English
Published:	Hindawi 2022
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8808205/ https://www.ncbi.nlm.nih.gov/pubmed/35126495 http://dx.doi.org/10.1155/2022/5649994

_version_	1784643836051456000
author	Zhou, Ning; Shi, Wenqian; Liang, Renyu; Zhong, Na
author_sort	Zhou, Ning
collection	PubMed
description	When the graph-based TextRank algorithm constructs the associative edges of its graph, the co-occurrence window rule considers only relationships between local terms, so the algorithm makes limited use of the information in the document itself. To address this problem, an improved TextRank keyword extraction algorithm, RDD-WRank, is proposed, which combines rough data reasoning with word vector clustering. First, the algorithm uses rough data reasoning to mine associations between candidate keywords, which expands the search scope and makes the results more comprehensive. Then, using the Wikipedia online open knowledge base, Word2Vec word embeddings are integrated into the improved algorithm, and the word vectors of the nodes in the TextRank lexical graph are clustered to adjust the voting importance of the nodes in each cluster. Experimental results show that, compared with the traditional TextRank algorithm and with TextRank combined with Word2Vec, the improved algorithm significantly improves extraction accuracy, which shows that rough data reasoning can effectively improve keyword extraction performance.
format	Online Article Text
id	pubmed-8808205
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Hindawi
record_format	MEDLINE/PubMed
spelling	pubmed-8808205 2022-02-03 TextRank Keyword Extraction Algorithm Using Word Vector Clustering Based on Rough Data-Deduction Zhou, Ning Shi, Wenqian Liang, Renyu Zhong, Na Comput Intell Neurosci Research Article When the graph-based TextRank algorithm constructs the associative edges of its graph, the co-occurrence window rule considers only relationships between local terms, so the algorithm makes limited use of the information in the document itself. To address this problem, an improved TextRank keyword extraction algorithm, RDD-WRank, is proposed, which combines rough data reasoning with word vector clustering. First, the algorithm uses rough data reasoning to mine associations between candidate keywords, which expands the search scope and makes the results more comprehensive. Then, using the Wikipedia online open knowledge base, Word2Vec word embeddings are integrated into the improved algorithm, and the word vectors of the nodes in the TextRank lexical graph are clustered to adjust the voting importance of the nodes in each cluster. Experimental results show that, compared with the traditional TextRank algorithm and with TextRank combined with Word2Vec, the improved algorithm significantly improves extraction accuracy, which shows that rough data reasoning can effectively improve keyword extraction performance. Hindawi 2022-01-25 /pmc/articles/PMC8808205/ /pubmed/35126495 http://dx.doi.org/10.1155/2022/5649994 Text en Copyright © 2022 Ning Zhou et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	TextRank Keyword Extraction Algorithm Using Word Vector Clustering Based on Rough Data-Deduction
title_sort	textrank keyword extraction algorithm using word vector clustering based on rough data-deduction
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8808205/ https://www.ncbi.nlm.nih.gov/pubmed/35126495 http://dx.doi.org/10.1155/2022/5649994
work_keys_str_mv	AT zhouning textrankkeywordextractionalgorithmusingwordvectorclusteringbasedonroughdatadeduction AT shiwenqian textrankkeywordextractionalgorithmusingwordvectorclusteringbasedonroughdatadeduction AT liangrenyu textrankkeywordextractionalgorithmusingwordvectorclusteringbasedonroughdatadeduction AT zhongna textrankkeywordextractionalgorithmusingwordvectorclusteringbasedonroughdatadeduction
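
For orientation, the baseline that the description refers to (TextRank over a co-occurrence window graph) can be sketched roughly as follows. This is a minimal illustration only, not the RDD-WRank algorithm of the article: the function name `textrank_keywords`, its parameters, and the default values are assumptions chosen for the example, and the rough data reasoning and Word2Vec clustering steps are not shown.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words with plain TextRank over a co-occurrence graph (illustrative sketch)."""
    # Link each word to the distinct words among the next `window - 1` tokens,
    # so only locally co-occurring terms become neighbors.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # PageRank-style update on the undirected, unweighted graph:
    # each node splits its current score evenly among its neighbors.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Return the highest-scoring words as keyword candidates.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Because edges come only from the sliding window, two related terms that never appear close together are never linked, which is the limitation the description attributes to the co-occurrence window rule.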


Similar Items