Inferring Latent Disease-lncRNA Associations by Faster Matrix Completion on a Heterogeneous Network

Bibliographic Details
Main authors: Li, Wen; Wang, Shulin; Xu, Junlin; Mao, Guo; Tian, Geng; Yang, Jialiang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6749816/
https://www.ncbi.nlm.nih.gov/pubmed/31572428
http://dx.doi.org/10.3389/fgene.2019.00769
author Li, Wen
Wang, Shulin
Xu, Junlin
Mao, Guo
Tian, Geng
Yang, Jialiang
author_sort Li, Wen
collection PubMed
description Current studies have shown that long non-coding RNAs (lncRNAs) play a crucial role in a variety of fundamental biological processes related to complex human diseases. Predicting latent disease-lncRNA associations helps to explain the pathogenesis of complex human diseases at the lncRNA level, and contributes to the detection of disease biomarkers and to the diagnosis, treatment, prognosis, and prevention of disease. Nevertheless, accurately identifying latent disease-lncRNA associations remains a challenging and urgent task. Discovering latent links through biological experiments is time-consuming and costly, which motivates the development of computational prediction models. In this study, the prediction task is recast as a matrix completion problem, as in recommendation systems, in which the unknown items of the rating matrix are completed. A novel method named faster randomized matrix completion for latent disease-lncRNA association prediction (FRMCLDA) is proposed, based on an improved randomized partial SVD (rSVD-BKI) on a heterogeneous bilayer network. First, the correlated data sources and experimentally validated information on diseases and lncRNAs are integrated to construct a heterogeneous bilayer network. Next, the integrated network is formalized as a comprehensive adjacency matrix comprising the lncRNA similarity matrix, the disease similarity matrix, and the disease-lncRNA association matrix, in which the uncertain disease-lncRNA associations are treated as blank items. Then, a matrix approximating the original adjacency matrix is constructed, whose predicted scores fill in the blank items. The construction of this approximate matrix can be equivalently cast as nuclear norm minimization.
Finally, a faster singular value thresholding algorithm, which uses a randomized partial SVD combined with a new sub-space reuse technique, is applied to complete the adjacency matrix. The results of leave-one-out cross-validation (LOOCV) and 5-fold cross-validation (5-fold CV) experiments on three different benchmark databases confirm the effectiveness and adaptability of FRMCLDA in inferring latent relationships of disease-lncRNA pairs, and in inferring lncRNAs correlated with novel diseases without any prior interaction information. Additionally, case studies show that FRMCLDA is able to effectively predict latent lncRNAs correlated with three widespread malignancies: prostate cancer, colon cancer, and gastric cancer.
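The pipeline outlined in the description (block adjacency matrix, blank items, nuclear norm minimization via singular value thresholding with a randomized partial SVD) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the FRMCLDA implementation: the function names, the block layout, and the parameters `tau`, `delta`, and `rank` are hypothetical, and a basic Gaussian-sketch randomized SVD with power iterations stands in for rSVD-BKI and its sub-space reuse technique, which are not reproduced here.

```python
import numpy as np


def build_adjacency(lnc_sim, dis_sim, assoc):
    """Assemble the bilayer network as one symmetric block matrix.

    Assumed layout: assoc has shape (n_lncRNA, n_disease); 1 marks a known
    association and 0 marks an unknown ("blank") item.
    """
    M = np.block([[lnc_sim, assoc], [assoc.T, dis_sim]]).astype(float)
    known = (assoc != 0).astype(float)
    # Similarity blocks are fully observed; only known associations are observed.
    mask = np.block([[np.ones_like(lnc_sim), known],
                     [known.T, np.ones_like(dis_sim)]]).astype(float)
    return M, mask


def randomized_svd(A, rank, rng, oversample=5, n_power=2):
    """Basic randomized partial SVD (a stand-in for rSVD-BKI)."""
    sketch = min(rank + oversample, min(A.shape))
    # Orthonormal basis for the range of A from a Gaussian sketch.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((A.shape[1], sketch)))
    for _ in range(n_power):
        # Power iterations sharpen the basis toward the leading subspace.
        Q, _ = np.linalg.qr(A @ (A.T @ Q))
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]


def svt_shrink(Y, tau, rank, rng):
    """Soft-threshold the leading singular values of Y by tau."""
    U, s, Vt = randomized_svd(Y, rank, rng)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def svt_complete(M, mask, tau, delta, rank, n_iter=200, tol=1e-4, seed=0):
    """Singular value thresholding iteration for matrix completion."""
    rng = np.random.default_rng(seed)
    norm_obs = np.linalg.norm(mask * M)
    Y = delta * mask * M            # first dual step, starting from X = 0
    X = np.zeros_like(M)
    for _ in range(n_iter):
        X = svt_shrink(Y, tau, rank, rng)
        residual = mask * (M - X)   # error on observed entries only
        if np.linalg.norm(residual) <= tol * norm_obs:
            break
        Y = Y + delta * residual
    return X
```

Under the assumed layout, the predicted scores for disease-lncRNA pairs would be read from the off-diagonal block of the completed matrix, `X[:n_lnc, n_lnc:]`, where `n_lnc` is the number of lncRNAs; ranking the blank items of that block by score gives candidate associations.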
format Online
Article
Text
id pubmed-6749816
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6749816 2019-09-30 Inferring Latent Disease-lncRNA Associations by Faster Matrix Completion on a Heterogeneous Network Li, Wen; Wang, Shulin; Xu, Junlin; Mao, Guo; Tian, Geng; Yang, Jialiang Front Genet Genetics Frontiers Media S.A. 2019-09-04 /pmc/articles/PMC6749816/ /pubmed/31572428 http://dx.doi.org/10.3389/fgene.2019.00769 Text en Copyright © 2019 Li, Wang, Xu, Mao, Tian and Yang http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Inferring Latent Disease-lncRNA Associations by Faster Matrix Completion on a Heterogeneous Network
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6749816/
https://www.ncbi.nlm.nih.gov/pubmed/31572428
http://dx.doi.org/10.3389/fgene.2019.00769