
Heterogeneous deep graph convolutional network with citation relational BERT for COVID-19 inline citation recommendation

The outbreak of COVID-19 has triggered one of the largest surges of scientific literature ever. Faced with such a volume of literature, researchers, especially junior ones, find it hard to locate the desired citations when carrying out COVID-19-related research. This paper presents a novel neural network ba...


Bibliographic Details
Main authors: Dai, Tao; Zhao, Jie; Li, Dehong; Tian, Shun; Zhao, Xiangmo; Pan, Shirui
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9482209/
https://www.ncbi.nlm.nih.gov/pubmed/36157791
http://dx.doi.org/10.1016/j.eswa.2022.118841
author Dai, Tao
Zhao, Jie
Li, Dehong
Tian, Shun
Zhao, Xiangmo
Pan, Shirui
collection PubMed
description The outbreak of COVID-19 has triggered one of the largest surges of scientific literature ever. Faced with such a volume of literature, researchers, especially junior ones, find it hard to locate the desired citations when carrying out COVID-19-related research. This paper presents a novel neural-network-based method, called citation relational BERT with heterogeneous deep graph convolutional network (CRB-HDGCN), for the COVID-19 inline citation recommendation task. CRB-HDGCN consists of two main stages. The first stage enhances the representation learning of the BERT model for COVID-19 inline citation recommendation through CRB. To this end, an augmented citation sentence corpus, in which each citation placeholder is replaced with the title of the cited paper, is used to lightly retrain the BERT model. In addition, we extract three types of sentence pairs according to their citation relations and establish sentence prediction tasks to further fine-tune the BERT model. The second stage learns effective dense vectors for the nodes of the COVID-19 bibliographic graph through HDGCN. HDGCN contains four layers, each of which is essentially a sub-neural network. The first layer is an initial embedding layer, which generates fixed-size initial input vectors through CRB and a multilayer perceptron. The second layer is a heterogeneous graph convolutional layer, in which we extend the traditional homogeneous graph convolutional network to the heterogeneous setting by adding heterogeneous nodes and relations. The third layer is a deep attention layer, which uses trainable projection vectors to reweight node importance simultaneously according to both node types and convolution layers, further improving the quality of the learnt node vectors. The last layer is a decoder that recovers the graph structure and makes the whole network trainable. The recommendation is finally obtained by integrating the heterogeneous vectors learnt by CRB-HDGCN with the query vectors. We conduct experiments on the CORD-19 and LitCovid datasets. The results show that, compared with the second-best method, CO-Search, CRB-HDGCN improves MAP, MRR, P@100 and R@100 by 21.8%, 22.7%, 37.6% and 21.2% on CORD-19, and by 29.1%, 25.9%, 15.3% and 11.3% on LitCovid, respectively.
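The four metrics named in the description (MAP, MRR, P@100 and R@100) are standard ranking measures. As a minimal sketch of how they are commonly computed for a citation recommender, the Python below uses the textbook definitions; it is not taken from the paper, whose exact evaluation protocol is not given in this record. The function names, the `runs` structure and the toy paper identifiers are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k recommended papers that are truly cited."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the truly cited papers that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """1 / rank of the first relevant paper, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Mean of the precision values at each rank where a relevant paper occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def evaluate(runs: List[Tuple[Sequence[Hashable], Set[Hashable]]], k: int) -> Dict[str, float]:
    """Average each metric over all queries (citation contexts)."""
    n = len(runs)
    return {
        "MAP": sum(average_precision(r, rel) for r, rel in runs) / n,
        "MRR": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
        f"P@{k}": sum(precision_at_k(r, rel, k) for r, rel in runs) / n,
        f"R@{k}": sum(recall_at_k(r, rel, k) for r, rel in runs) / n,
    }


# Toy data (hypothetical): two citation contexts, each with a ranked list of
# recommended paper ids and the set of papers that were actually cited.
runs = [
    (["p3", "p1", "p7", "p2"], {"p1", "p2"}),
    (["a", "b", "c"], {"a"}),
]
scores = evaluate(runs, k=2)
print(scores)
```

With k set to 100, the same functions give P@100 and R@100. The percentage improvements quoted in the description compare such scores between CRB-HDGCN and CO-Search.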
format Online
Article
Text
id pubmed-9482209
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-9482209 2022-09-19 Expert Syst Appl Article Elsevier Ltd. 2023-03-01 2022-09-17 /pmc/articles/PMC9482209/ /pubmed/36157791 http://dx.doi.org/10.1016/j.eswa.2022.118841 Text en © 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Heterogeneous deep graph convolutional network with citation relational BERT for COVID-19 inline citation recommendation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9482209/
https://www.ncbi.nlm.nih.gov/pubmed/36157791
http://dx.doi.org/10.1016/j.eswa.2022.118841