Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document

Traditional mathematical search models retrieve scientific documents only by mathematical expressions and their contexts and do not consider the ontological attributes of scientific documents, which results in gaps between the queries and the retrieval results. To solve this problem, a retrieval and...

Bibliographic Details
Main Authors: Jiang, Xinyu; Tian, Bingjie; Tian, Xuedong
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9223173/
https://www.ncbi.nlm.nih.gov/pubmed/35741531
http://dx.doi.org/10.3390/e24060810
_version_ 1784733061219352576
author Jiang, Xinyu
Tian, Bingjie
Tian, Xuedong
author_sort Jiang, Xinyu
collection PubMed
description Traditional mathematical search models retrieve scientific documents only by mathematical expressions and their contexts and do not consider the ontological attributes of scientific documents, which results in gaps between the queries and the retrieval results. To solve this problem, a retrieval and ranking model is constructed that combines the information in mathematical expressions with their related text and extracts the ontological attributes of scientific documents to further sort the retrieval results. First, a hesitant fuzzy set of mathematical expressions is constructed to address the multi-attribute problem of mathematical expression matching; then, the similarity of the sentences surrounding each mathematical expression is calculated using the bidirectional encoding of a BiLSTM, and the retrieval result is obtained by combining the expression similarity with the sentence similarity; finally, the retrieval results are ranked according to the ontological attributes of the scientific documents to obtain the final search results. The MAP_10 value of the mathematical expression retrieval results on the Ntcir-Mathir-Wikipedia-Corpus dataset is 0.815, and the average NDCG@10 of the scientific document ranking results is 0.9; these results support the effectiveness of the scientific document retrieval and ranking method.
format Online
Article
Text
id pubmed-9223173
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9223173 2022-06-24 Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document Jiang, Xinyu Tian, Bingjie Tian, Xuedong Entropy (Basel) Article MDPI 2022-06-10 /pmc/articles/PMC9223173/ /pubmed/35741531 http://dx.doi.org/10.3390/e24060810 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document
title_sort retrieval and ranking of combining ontology and content attributes for scientific document
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9223173/
https://www.ncbi.nlm.nih.gov/pubmed/35741531
http://dx.doi.org/10.3390/e24060810