
COS: A new MeSH term embedding incorporating corpus, ontology, and semantic predications

The embedding of Medical Subject Headings (MeSH) terms has become a foundation for many downstream bioinformatics tasks. Recent studies employ different data sources, such as the corpus (in which each document is indexed by a set of MeSH terms), the MeSH term ontology, and the semantic predications...


Bibliographic Details
Main authors: Ding, Juncheng, Jin, Wei
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8096083/
https://www.ncbi.nlm.nih.gov/pubmed/33945566
http://dx.doi.org/10.1371/journal.pone.0251094
author Ding, Juncheng
Jin, Wei
collection PubMed
description The embedding of Medical Subject Headings (MeSH) terms has become a foundation for many downstream bioinformatics tasks. Recent studies employ different data sources, such as the corpus (in which each document is indexed by a set of MeSH terms), the MeSH term ontology, and the semantic predications between MeSH terms (extracted by SemMedDB), to learn their embeddings. While these data sources contribute to learning the MeSH term embeddings, current approaches fail to incorporate all of them in the learning process. The challenge is that the structured relationships between MeSH terms are different across the data sources, and there is no approach to fusing such complex data into the MeSH term embedding learning. In this paper, we study the problem of incorporating corpus, ontology, and semantic predications to learn the embeddings of MeSH terms. We propose a novel framework, Corpus, Ontology, and Semantic predications-based MeSH term embedding (COS), to generate high-quality MeSH term embeddings. COS converts the corpus, ontology, and semantic predications into MeSH term sequences, merges these sequences, and learns MeSH term embeddings using the sequences. Extensive experiments on different datasets show that COS outperforms various baseline embeddings and traditional non-embedding-based baselines.
format Online
Article
Text
id pubmed-8096083
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8096083 2021-05-17 PLoS One Research Article Public Library of Science 2021-05-04 /pmc/articles/PMC8096083/ /pubmed/33945566 http://dx.doi.org/10.1371/journal.pone.0251094 Text en © 2021 Ding, Jin https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title COS: A new MeSH term embedding incorporating corpus, ontology, and semantic predications
topic Research Article