PPR-SSM: personalized PageRank and semantic similarity measures for entity linking
BACKGROUND: Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, specifically in fields that are advancin...
Main authors: | Lamurias, Andre; Ruas, Pedro; Couto, Francisco M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819326/ https://www.ncbi.nlm.nih.gov/pubmed/31664891 http://dx.doi.org/10.1186/s12859-019-3157-y |
_version_ | 1783463702487367680 |
---|---|
author | Lamurias, Andre Ruas, Pedro Couto, Francisco M. |
author_facet | Lamurias, Andre Ruas, Pedro Couto, Francisco M. |
author_sort | Lamurias, Andre |
collection | PubMed |
description | BACKGROUND: Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, especially in rapidly advancing fields such as drug design and development. Entity linking is a text mining task that aims to link entities mentioned in the literature to concepts in a knowledge base. For example, entity linking can help find all documents that mention the same concept and improve relation extraction methods. Existing approaches focus on the local similarity of each entity and the global coherence of all entities in a document, but do not take into account the semantics of the domain. RESULTS: We propose a method, PPR-SSM, to link entities found in documents to concepts from domain-specific ontologies. Our method is based on Personalized PageRank (PPR), using the relations of the ontology to generate a graph of candidate concepts for the mentioned entities. We demonstrate how the knowledge encoded in a domain-specific ontology can be used to calculate the coherence of a set of candidate concepts, improving the accuracy of entity linking. Furthermore, we explore weighting the edges between candidate concepts using semantic similarity measures (SSM). We show how PPR-SSM can be used to effectively link named entities to biomedical ontologies, namely chemical compounds, phenotypes, and gene-product localization and processes. CONCLUSIONS: We demonstrated that PPR-SSM outperforms state-of-the-art entity linking methods on four distinct gold standards, by taking advantage of the semantic information contained in ontologies. Moreover, PPR-SSM is a graph-based method that does not require training data. Our method improved the entity linking accuracy for chemical compounds by 0.1385 when compared to a method that does not use SSMs. |
format | Online Article Text |
id | pubmed-6819326 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-68193262019-10-31 PPR-SSM: personalized PageRank and semantic similarity measures for entity linking Lamurias, Andre Ruas, Pedro Couto, Francisco M. BMC Bioinformatics Methodology Article BACKGROUND: Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, specifically in fields that are advancing more rapidly, for example, drug design and development. Entity linking is a text mining task that aims at linking entities mentioned in the literature to concepts in a knowledge base. For example, entity linking can help finding all documents that mention the same concept and improve relation extraction methods. Existing approaches focus on the local similarity of each entity and the global coherence of all entities in a document, but do not take into account the semantics of the domain. RESULTS: We propose a method, PPR-SSM, to link entities found in documents to concepts from domain-specific ontologies. Our method is based on Personalized PageRank (PPR), using the relations of the ontology to generate a graph of candidate concepts for the mentioned entities. We demonstrate how the knowledge encoded in a domain-specific ontology can be used to calculate the coherence of a set of candidate concepts, improving the accuracy of entity linking. Furthermore, we explore weighting the edges between candidate concepts using semantic similarity measures (SSM). We show how PPR-SSM can be used to effectively link named entities to biomedical ontologies, namely chemical compounds, phenotypes, and gene-product localization and processes. CONCLUSIONS: We demonstrated that PPR-SSM outperforms state-of-the-art entity linking methods in four distinct gold standards, by taking advantage of the semantic information contained in ontologies. 
Moreover, PPR-SSM is a graph-based method that does not require training data. Our method improved the entity linking accuracy of chemical compounds by 0.1385 when compared to a method that does not use SSMs. BioMed Central 2019-10-29 /pmc/articles/PMC6819326/ /pubmed/31664891 http://dx.doi.org/10.1186/s12859-019-3157-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Lamurias, Andre Ruas, Pedro Couto, Francisco M. PPR-SSM: personalized PageRank and semantic similarity measures for entity linking |
title | PPR-SSM: personalized PageRank and semantic similarity measures for entity linking |
title_full | PPR-SSM: personalized PageRank and semantic similarity measures for entity linking |
title_fullStr | PPR-SSM: personalized PageRank and semantic similarity measures for entity linking |
title_full_unstemmed | PPR-SSM: personalized PageRank and semantic similarity measures for entity linking |
title_short | PPR-SSM: personalized PageRank and semantic similarity measures for entity linking |
title_sort | ppr-ssm: personalized pagerank and semantic similarity measures for entity linking |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819326/ https://www.ncbi.nlm.nih.gov/pubmed/31664891 http://dx.doi.org/10.1186/s12859-019-3157-y |
work_keys_str_mv | AT lamuriasandre pprssmpersonalizedpagerankandsemanticsimilaritymeasuresforentitylinking AT ruaspedro pprssmpersonalizedpagerankandsemanticsimilaritymeasuresforentitylinking AT coutofranciscom pprssmpersonalizedpagerankandsemanticsimilaritymeasuresforentitylinking |
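
The abstract in this record describes ranking candidate concepts with Personalized PageRank over a graph whose edges may be weighted by semantic similarity. The following is a minimal Python sketch of that general idea, not the authors' implementation: the node names (`cand_a1`, `cand_a2`, `cand_b1`), the mention labels (`m1`, `m2`), the edge weights, and the helper functions `personalized_pagerank` and `link_entities` are all hypothetical and chosen only for illustration. It uses plain power iteration with a restart distribution and assumes every neighbor also appears as a key of the graph.

```python
from typing import Dict, List

# node -> {neighbor: edge weight}; weights stand in for semantic similarity scores
Graph = Dict[str, Dict[str, float]]


def personalized_pagerank(
    graph: Graph,
    personalization: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 100,
) -> Dict[str, float]:
    """Weighted Personalized PageRank by power iteration (illustrative only)."""
    nodes = list(graph)
    total = sum(personalization.get(n, 0.0) for n in nodes)
    if total <= 0.0:
        raise ValueError("personalization must give positive mass to some node")
    # Normalized restart distribution: where the random walk jumps back to
    restart = {n: personalization.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            out_weight = sum(graph[node].values())
            if out_weight == 0.0:
                # Dangling node: send its mass back to the restart distribution
                for n in nodes:
                    new_rank[n] += damping * rank[node] * restart[n]
                continue
            for neighbor, weight in graph[node].items():
                # Share rank among neighbors in proportion to edge weight
                new_rank[neighbor] += damping * rank[node] * weight / out_weight
        rank = new_rank
    return rank


def link_entities(
    candidates: Dict[str, List[str]], rank: Dict[str, float]
) -> Dict[str, str]:
    """Pick the highest-ranked candidate concept for each mention."""
    return {m: max(cands, key=lambda c: rank[c]) for m, cands in candidates.items()}


# Hypothetical toy data: mention m1 has two candidates, mention m2 has one.
graph: Graph = {
    "cand_a1": {"cand_b1": 0.9},
    "cand_a2": {"cand_b1": 0.1},
    "cand_b1": {"cand_a1": 0.9, "cand_a2": 0.1},
}
candidates = {"m1": ["cand_a1", "cand_a2"], "m2": ["cand_b1"]}
uniform = {node: 1.0 for node in graph}

ranks = personalized_pagerank(graph, uniform)
links = link_entities(candidates, ranks)
print(links)
```

In this toy graph, `cand_a1` shares a heavier edge with `cand_b1` than `cand_a2` does, so it accumulates more rank and is selected for `m1`. How the published method builds the graph, sets the restart distribution, and computes the similarity weights is described in the article itself and is not reproduced here.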