Application and evaluation of knowledge graph embeddings in biomedical data
Linked data and bio-ontologies enabling knowledge representation, standardization, and dissemination are an integral part of developing biological and biomedical databases. That is, linked data and bio-ontologies are employed in databases to maintain data integrity, data organization, and to empower...
Main authors: | Alshahrani, Mona; Thafar, Maha A.; Essack, Magbubah |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7959619/ https://www.ncbi.nlm.nih.gov/pubmed/33816992 http://dx.doi.org/10.7717/peerj-cs.341 |
_version_ | 1783664988882206720 |
---|---|
author | Alshahrani, Mona Thafar, Maha A. Essack, Magbubah |
author_facet | Alshahrani, Mona Thafar, Maha A. Essack, Magbubah |
author_sort | Alshahrani, Mona |
collection | PubMed |
description | Linked data and bio-ontologies enabling knowledge representation, standardization, and dissemination are an integral part of developing biological and biomedical databases. That is, linked data and bio-ontologies are employed in databases to maintain data integrity, data organization, and to empower search capabilities. However, linked data and bio-ontologies are more recently being used to represent information as multi-relational heterogeneous graphs, “knowledge graphs”. The reason being, entities and relations in the knowledge graph can be represented as embedding vectors in semantic space, and these embedding vectors have been used to predict relationships between entities. Such knowledge graph embedding methods provide a practical approach to data analytics and increase chances of building machine learning models with high prediction accuracy that can enhance decision support systems. Here, we present a comparative assessment and a standard benchmark for knowledge graph-based representation learning methods focused on the link prediction task for biological relations. We systematically investigated and compared state-of-the-art embedding methods based on the design settings used for training and evaluation. We further tested various strategies aimed at controlling the amount of information related to each relation in the knowledge graph and its effects on the final performance. We also assessed the quality of the knowledge graph features through clustering and visualization and employed several evaluation metrics to examine their uses and differences. Based on this systematic comparison and assessments, we identify and discuss the limitations of knowledge graph-based representation learning methods and suggest some guidelines for the development of more improved methods. |
format | Online Article Text |
id | pubmed-7959619 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7959619 2021-04-02 Application and evaluation of knowledge graph embeddings in biomedical data Alshahrani, Mona Thafar, Maha A. Essack, Magbubah PeerJ Comput Sci Bioinformatics PeerJ Inc. 2021-02-18 /pmc/articles/PMC7959619/ /pubmed/33816992 http://dx.doi.org/10.7717/peerj-cs.341 Text en © 2021 Alshahrani et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Alshahrani, Mona Thafar, Maha A. Essack, Magbubah Application and evaluation of knowledge graph embeddings in biomedical data |
title | Application and evaluation of knowledge graph embeddings in biomedical data |
title_full | Application and evaluation of knowledge graph embeddings in biomedical data |
title_fullStr | Application and evaluation of knowledge graph embeddings in biomedical data |
title_full_unstemmed | Application and evaluation of knowledge graph embeddings in biomedical data |
title_short | Application and evaluation of knowledge graph embeddings in biomedical data |
title_sort | application and evaluation of knowledge graph embeddings in biomedical data |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7959619/ https://www.ncbi.nlm.nih.gov/pubmed/33816992 http://dx.doi.org/10.7717/peerj-cs.341 |
work_keys_str_mv | AT alshahranimona applicationandevaluationofknowledgegraphembeddingsinbiomedicaldata AT thafarmahaa applicationandevaluationofknowledgegraphembeddingsinbiomedicaldata AT essackmagbubah applicationandevaluationofknowledgegraphembeddingsinbiomedicaldata |
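
The description above says that entities and relations can be represented as embedding vectors and that these vectors are used to predict relationships between entities. The following is a minimal sketch of that idea, not the authors' benchmark code. It assumes a TransE-style translation score and two common rank-based metrics (mean reciprocal rank and Hits@k). The record names none of these. The entity names, relation name, and vector values are hypothetical toy data chosen only for illustration.

```python
import math

# Hypothetical 2-dimensional embeddings (toy values, not learned from data).
ENTITY_VECS = {
    "gene_A": [0.0, 0.0],
    "disease_X": [1.0, 0.0],
    "disease_Y": [0.0, 3.0],
    "disease_Z": [4.0, 4.0],
}
RELATION_VECS = {"associated_with": [1.0, 0.0]}


def transe_score(head, relation, tail):
    """Return -||h + r - t|| (Euclidean); a higher score means a more plausible triple."""
    h = ENTITY_VECS[head]
    r = RELATION_VECS[relation]
    t = ENTITY_VECS[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_of_true_tail(head, relation, true_tail, candidates):
    """Rank (1 = best) of the true tail among candidate tails, ordered by score."""
    ordered = sorted(
        candidates,
        key=lambda c: transe_score(head, relation, c),
        reverse=True,
    )
    return ordered.index(true_tail) + 1


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples whose true tail is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Link prediction on two hypothetical test triples that share a head and relation.
candidate_tails = ["disease_X", "disease_Y", "disease_Z"]
test_triples = [
    ("gene_A", "associated_with", "disease_X"),
    ("gene_A", "associated_with", "disease_Y"),
]
ranks = [rank_of_true_tail(h, r, t, candidate_tails) for h, r, t in test_triples]
mrr = mean_reciprocal_rank(ranks)
hits1 = hits_at_k(ranks, 1)
```

In this toy setup the tail closest to `gene_A + associated_with` ranks first. Real evaluations typically also filter out other known true triples before ranking, which this sketch omits.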