Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document
Traditional mathematical search models retrieve scientific documents only by mathematical expressions and their contexts and do not consider the ontological attributes of scientific documents, which results in gaps between the queries and the retrieval results. To solve this problem, a retrieval and...
Main authors: | Jiang, Xinyu; Tian, Bingjie; Tian, Xuedong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9223173/ https://www.ncbi.nlm.nih.gov/pubmed/35741531 http://dx.doi.org/10.3390/e24060810 |
_version_ | 1784733061219352576 |
---|---|
author | Jiang, Xinyu Tian, Bingjie Tian, Xuedong |
author_facet | Jiang, Xinyu Tian, Bingjie Tian, Xuedong |
author_sort | Jiang, Xinyu |
collection | PubMed |
description | Traditional mathematical search models retrieve scientific documents only by mathematical expressions and their contexts and do not consider the ontological attributes of scientific documents, which results in gaps between the queries and the retrieval results. To solve this problem, a retrieval and ranking model is constructed that synthesizes the information of mathematical expressions with related texts, and the ontology attributes of scientific documents are extracted to further sort the retrieval results. First, the hesitant fuzzy set of mathematical expressions is constructed by using the characteristics of the hesitant fuzzy set to address the multi-attribute problem of mathematical expression matching; then, the similarity of the mathematical expression context sentence is calculated by using the BiLSTM two-way coding feature, and the retrieval result is obtained by synthesizing the similarity between the mathematical expression and the sentence; finally, considering the ontological attributes of scientific documents, the retrieval results are ranked to obtain the final search results. The MAP_10 value of the mathematical expression retrieval results on the Ntcir-Mathir-Wikipedia-Corpus dataset is 0.815, and the average value of the NDCG@10 of the scientific document ranking results is 0.9; these results prove the effectiveness of the scientific document retrieval and ranking method. |
format | Online Article Text |
id | pubmed-9223173 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9223173 2022-06-24 Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document Jiang, Xinyu Tian, Bingjie Tian, Xuedong Entropy (Basel) Article Traditional mathematical search models retrieve scientific documents only by mathematical expressions and their contexts and do not consider the ontological attributes of scientific documents, which results in gaps between the queries and the retrieval results. To solve this problem, a retrieval and ranking model is constructed that synthesizes the information of mathematical expressions with related texts, and the ontology attributes of scientific documents are extracted to further sort the retrieval results. First, the hesitant fuzzy set of mathematical expressions is constructed by using the characteristics of the hesitant fuzzy set to address the multi-attribute problem of mathematical expression matching; then, the similarity of the mathematical expression context sentence is calculated by using the BiLSTM two-way coding feature, and the retrieval result is obtained by synthesizing the similarity between the mathematical expression and the sentence; finally, considering the ontological attributes of scientific documents, the retrieval results are ranked to obtain the final search results. The MAP_10 value of the mathematical expression retrieval results on the Ntcir-Mathir-Wikipedia-Corpus dataset is 0.815, and the average value of the NDCG@10 of the scientific document ranking results is 0.9; these results prove the effectiveness of the scientific document retrieval and ranking method. MDPI 2022-06-10 /pmc/articles/PMC9223173/ /pubmed/35741531 http://dx.doi.org/10.3390/e24060810 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Jiang, Xinyu Tian, Bingjie Tian, Xuedong Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document |
title | Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document |
title_full | Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document |
title_fullStr | Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document |
title_full_unstemmed | Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document |
title_short | Retrieval and Ranking of Combining Ontology and Content Attributes for Scientific Document |
title_sort | retrieval and ranking of combining ontology and content attributes for scientific document |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9223173/ https://www.ncbi.nlm.nih.gov/pubmed/35741531 http://dx.doi.org/10.3390/e24060810 |
work_keys_str_mv | AT jiangxinyu retrievalandrankingofcombiningontologyandcontentattributesforscientificdocument AT tianbingjie retrievalandrankingofcombiningontologyandcontentattributesforscientificdocument AT tianxuedong retrievalandrankingofcombiningontologyandcontentattributesforscientificdocument |
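
The description above reports two ranking metrics, MAP_10 and NDCG@10. As a reading aid, the following is a minimal Python sketch of how such metrics are commonly computed. It is not the authors' evaluation code. The function names are hypothetical, and two conventions are assumed here: average precision is normalized by the number of relevant items found in the top k, and the ideal ordering for NDCG is taken from the gains of the retrieved list itself. The article may use different conventions.

```python
import math
from typing import Sequence


def average_precision_at_k(relevant: Sequence[bool], k: int = 10) -> float:
    """AP@k for one query; `relevant[i]` says whether the item at rank i+1 is relevant.

    Assumption: normalized by the number of relevant items found in the top k.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(runs: Sequence[Sequence[bool]], k: int = 10) -> float:
    """MAP@k: mean of AP@k over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision_at_k(run, k) for run in runs) / len(runs)


def dcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """NDCG@k: DCG of the given order divided by DCG of the best possible order.

    Assumption: the ideal order is built from the same list of gains.
    """
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0
```

For example, `ndcg_at_k([3, 2, 1])` evaluates a ranking that is already in ideal order, while `ndcg_at_k([1, 3])` penalizes a list whose most relevant document appears second.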