A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory

Name ambiguity, due to the fact that many people share an identical name, often deteriorates the performance of information integration, document retrieval and web search. In academic data analysis, author name ambiguity usually decreases the analysis performance. To solve this problem, an author na...

Bibliographic Details
Main authors: Ma, Yingying, Wu, Youlong, Lu, Chengqiang
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516896/
https://www.ncbi.nlm.nih.gov/pubmed/33286190
http://dx.doi.org/10.3390/e22040416
_version_ 1783587104062701568
author Ma, Yingying
Wu, Youlong
Lu, Chengqiang
author_facet Ma, Yingying
Wu, Youlong
Lu, Chengqiang
author_sort Ma, Yingying
collection PubMed
description Name ambiguity, due to the fact that many people share an identical name, often deteriorates the performance of information integration, document retrieval and web search. In academic data analysis, author name ambiguity usually decreases the analysis performance. To solve this problem, an author name disambiguation task is designed to divide documents related to an author name reference into several parts and each part is associated with a real-life person. Existing methods usually use either attributes of documents or relationships between documents and co-authors. However, methods of feature extraction using attributes cause inflexibility of models while solutions based on relationship graph network ignore the information contained in the features. In this paper, we propose a novel name disambiguation model based on representation learning which incorporates attributes and relationships. Experiments on a public real dataset demonstrate the effectiveness of our model and experimental results demonstrate that our solution is superior to several state-of-the-art graph-based methods. We also increase the interpretability of our method through information theory and show that the analysis could be helpful for model selection and training progress.
format Online
Article
Text
id pubmed-7516896
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-75168962020-11-09 A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory Ma, Yingying Wu, Youlong Lu, Chengqiang Entropy (Basel) Article Name ambiguity, due to the fact that many people share an identical name, often deteriorates the performance of information integration, document retrieval and web search. In academic data analysis, author name ambiguity usually decreases the analysis performance. To solve this problem, an author name disambiguation task is designed to divide documents related to an author name reference into several parts and each part is associated with a real-life person. Existing methods usually use either attributes of documents or relationships between documents and co-authors. However, methods of feature extraction using attributes cause inflexibility of models while solutions based on relationship graph network ignore the information contained in the features. In this paper, we propose a novel name disambiguation model based on representation learning which incorporates attributes and relationships. Experiments on a public real dataset demonstrate the effectiveness of our model and experimental results demonstrate that our solution is superior to several state-of-the-art graph-based methods. We also increase the interpretability of our method through information theory and show that the analysis could be helpful for model selection and training progress. MDPI 2020-04-07 /pmc/articles/PMC7516896/ /pubmed/33286190 http://dx.doi.org/10.3390/e22040416 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Ma, Yingying
Wu, Youlong
Lu, Chengqiang
A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory
title A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory
title_full A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory
title_fullStr A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory
title_full_unstemmed A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory
title_short A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory
title_sort graph-based author name disambiguation method and analysis via information theory
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516896/
https://www.ncbi.nlm.nih.gov/pubmed/33286190
http://dx.doi.org/10.3390/e22040416