Identifying lncRNA–disease association based on GAT multiple-operator aggregation and inductive matrix completion

Bibliographic Details
Main authors: Zhang, Yi; Wang, Yu; Li, Xin; Liu, Yarong; Chen, Min
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9631210/
https://www.ncbi.nlm.nih.gov/pubmed/36338997
http://dx.doi.org/10.3389/fgene.2022.1029300
author Zhang, Yi
Wang, Yu
Li, Xin
Liu, Yarong
Chen, Min
author_sort Zhang, Yi
collection PubMed
description Computational models, as a fundamental alternative to traditional biological experiments, have been applied to infer lncRNA–disease associations (LDAs) for many years, without the time-consuming and laborious limitations of those experiments. However, the sparsity inherent in known heterogeneous biological data is an obstacle to further improving the prediction accuracy of computational models. Therefore, a new computational model composed of multiple mechanisms for lncRNA–disease association prediction (MM-LDA) was proposed, based on the fusion of the graph attention network (GAT) and inductive matrix completion (IMC). MM-LDA has two key steps to improve prediction accuracy. First, a multiple-operator aggregation was designed in the n-head attention mechanism of the GAT; with this step, the features of lncRNA nodes and disease nodes were enhanced. Second, IMC was applied to the enhanced node features obtained in the first step, and the LDA network was then reconstructed to solve the cold-start problem that arises when an entire row or column of the known association matrix is missing. MM-LDA achieved the following: first, using the Adam optimizer, which adaptively adjusts the learning rate, increased the convergence speed and avoided falling into local optima. Second, it achieved better predictive ability than other similar models (an AUC of 0.9395 and an AUPR of 0.8057 under 5-fold cross-validation). Third, its time cost was 6.45% lower than that of the advanced model GAMCLDA. In short, MM-LDA achieved a more comprehensive prediction performance in terms of both prediction accuracy and time cost.
format Online
Article
Text
id pubmed-9631210
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9631210 2022-11-04 Identifying lncRNA–disease association based on GAT multiple-operator aggregation and inductive matrix completion Zhang, Yi Wang, Yu Li, Xin Liu, Yarong Chen, Min Front Genet Genetics Frontiers Media S.A. 2022-10-20 /pmc/articles/PMC9631210/ /pubmed/36338997 http://dx.doi.org/10.3389/fgene.2022.1029300 Text en Copyright © 2022 Zhang, Wang, Li, Liu and Chen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identifying lncRNA–disease association based on GAT multiple-operator aggregation and inductive matrix completion
topic Genetics