PPR-SSM: personalized PageRank and semantic similarity measures for entity linking

BACKGROUND: Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, specifically in fields that are advancin...

Bibliographic Details
Main Authors: Lamurias, Andre, Ruas, Pedro, Couto, Francisco M.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819326/
https://www.ncbi.nlm.nih.gov/pubmed/31664891
http://dx.doi.org/10.1186/s12859-019-3157-y
_version_ 1783463702487367680
author Lamurias, Andre
Ruas, Pedro
Couto, Francisco M.
author_facet Lamurias, Andre
Ruas, Pedro
Couto, Francisco M.
author_sort Lamurias, Andre
collection PubMed
description BACKGROUND: Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, specifically in fields that are advancing more rapidly, for example, drug design and development. Entity linking is a text mining task that aims at linking entities mentioned in the literature to concepts in a knowledge base. For example, entity linking can help finding all documents that mention the same concept and improve relation extraction methods. Existing approaches focus on the local similarity of each entity and the global coherence of all entities in a document, but do not take into account the semantics of the domain. RESULTS: We propose a method, PPR-SSM, to link entities found in documents to concepts from domain-specific ontologies. Our method is based on Personalized PageRank (PPR), using the relations of the ontology to generate a graph of candidate concepts for the mentioned entities. We demonstrate how the knowledge encoded in a domain-specific ontology can be used to calculate the coherence of a set of candidate concepts, improving the accuracy of entity linking. Furthermore, we explore weighting the edges between candidate concepts using semantic similarity measures (SSM). We show how PPR-SSM can be used to effectively link named entities to biomedical ontologies, namely chemical compounds, phenotypes, and gene-product localization and processes. CONCLUSIONS: We demonstrated that PPR-SSM outperforms state-of-the-art entity linking methods in four distinct gold standards, by taking advantage of the semantic information contained in ontologies. Moreover, PPR-SSM is a graph-based method that does not require training data. Our method improved the entity linking accuracy of chemical compounds by 0.1385 when compared to a method that does not use SSMs.
format Online
Article
Text
id pubmed-6819326
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6819326 2019-10-31 PPR-SSM: personalized PageRank and semantic similarity measures for entity linking Lamurias, Andre Ruas, Pedro Couto, Francisco M. BMC Bioinformatics Methodology Article BACKGROUND: Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, specifically in fields that are advancing more rapidly, for example, drug design and development. Entity linking is a text mining task that aims at linking entities mentioned in the literature to concepts in a knowledge base. For example, entity linking can help finding all documents that mention the same concept and improve relation extraction methods. Existing approaches focus on the local similarity of each entity and the global coherence of all entities in a document, but do not take into account the semantics of the domain. RESULTS: We propose a method, PPR-SSM, to link entities found in documents to concepts from domain-specific ontologies. Our method is based on Personalized PageRank (PPR), using the relations of the ontology to generate a graph of candidate concepts for the mentioned entities. We demonstrate how the knowledge encoded in a domain-specific ontology can be used to calculate the coherence of a set of candidate concepts, improving the accuracy of entity linking. Furthermore, we explore weighting the edges between candidate concepts using semantic similarity measures (SSM). We show how PPR-SSM can be used to effectively link named entities to biomedical ontologies, namely chemical compounds, phenotypes, and gene-product localization and processes. CONCLUSIONS: We demonstrated that PPR-SSM outperforms state-of-the-art entity linking methods in four distinct gold standards, by taking advantage of the semantic information contained in ontologies. Moreover, PPR-SSM is a graph-based method that does not require training data. Our method improved the entity linking accuracy of chemical compounds by 0.1385 when compared to a method that does not use SSMs. BioMed Central 2019-10-29 /pmc/articles/PMC6819326/ /pubmed/31664891 http://dx.doi.org/10.1186/s12859-019-3157-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Lamurias, Andre
Ruas, Pedro
Couto, Francisco M.
PPR-SSM: personalized PageRank and semantic similarity measures for entity linking
title PPR-SSM: personalized PageRank and semantic similarity measures for entity linking
title_full PPR-SSM: personalized PageRank and semantic similarity measures for entity linking
title_fullStr PPR-SSM: personalized PageRank and semantic similarity measures for entity linking
title_full_unstemmed PPR-SSM: personalized PageRank and semantic similarity measures for entity linking
title_short PPR-SSM: personalized PageRank and semantic similarity measures for entity linking
title_sort ppr-ssm: personalized pagerank and semantic similarity measures for entity linking
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819326/
https://www.ncbi.nlm.nih.gov/pubmed/31664891
http://dx.doi.org/10.1186/s12859-019-3157-y
work_keys_str_mv AT lamuriasandre pprssmpersonalizedpagerankandsemanticsimilaritymeasuresforentitylinking
AT ruaspedro pprssmpersonalizedpagerankandsemanticsimilaritymeasuresforentitylinking
AT coutofranciscom pprssmpersonalizedpagerankandsemanticsimilaritymeasuresforentitylinking