
Predicting Pseudogene–miRNA Associations Based on Feature Fusion and Graph Auto-Encoder

Pseudogenes were originally regarded as non-functional components scattered in the genome during evolution. Recent studies have shown that pseudogenes can be transcribed into long non-coding RNA and play a key role at multiple functional levels in different physiological and pathological processes....

Bibliographic Details
Main Authors: Zhou, Shijia, Sun, Weicheng, Zhang, Ping, Li, Li
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8710693/
https://www.ncbi.nlm.nih.gov/pubmed/34966413
http://dx.doi.org/10.3389/fgene.2021.781277
author Zhou, Shijia
Sun, Weicheng
Zhang, Ping
Li, Li
collection PubMed
description Pseudogenes were originally regarded as non-functional components scattered in the genome during evolution. Recent studies have shown that pseudogenes can be transcribed into long non-coding RNA and play a key role at multiple functional levels in different physiological and pathological processes. MicroRNAs (miRNAs) are a type of non-coding RNA that plays important regulatory roles in cells. Numerous studies have shown that pseudogenes and miRNAs interact and form a competing endogenous RNA (ceRNA) network with mRNA that regulates biological processes and is involved in diseases. Exploring the associations between pseudogenes and miRNAs will facilitate the clinical diagnosis of some diseases. Here, we propose a prediction model, PMGAE (Pseudogene–MiRNA association prediction based on the Graph Auto-Encoder), which incorporates feature fusion, a graph auto-encoder (GAE), and eXtreme Gradient Boosting (XGBoost). First, we calculated three types of similarity between nodes (Jaccard, cosine, and Pearson) based on the biological characteristics of pseudogenes and miRNAs. Subsequently, we fused these similarities to construct a similarity profile that serves as the initial representation feature for each node. Then, we aggregated the similarity profiles and associations of nodes through a GAE to obtain a low-dimensional representation vector for each node. In the last step, we fed these representation vectors into an XGBoost classifier to predict new pseudogene–miRNA associations (PMAs). The results of five-fold cross-validation show that PMGAE achieves a mean AUC of 0.8634 and a mean AUPR of 0.8966. Case studies further substantiated the reliability of PMGAE for mining PMAs and for studying endogenous RNA networks in relation to diseases.
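The first two steps of the description (computing Jaccard, cosine, and Pearson similarity between nodes, then fusing them into a similarity profile) can be illustrated with a minimal sketch. This is not the authors' code: the binary toy profiles, the function names, and the fusion by simple averaging are all assumptions made for illustration, since the record does not state which node features are used or how the three similarities are fused.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two binary vectors: |A and B| / |A or B|."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pearson(a, b):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b) if var_a and var_b else 0.0


def fused_similarity(a, b):
    """ASSUMPTION: fuse by unweighted averaging; the source does not say how."""
    return (jaccard(a, b) + cosine(a, b) + pearson(a, b)) / 3.0


# Hypothetical toy data: each pseudogene is described by a binary vector
# (for example, which of four miRNAs it is associated with).
profiles = {
    "P1": [1, 1, 0, 0],
    "P2": [1, 0, 0, 0],
    "P3": [0, 0, 1, 1],
}
names = sorted(profiles)

# The similarity profile of a node is its fused similarity to every node.
sim_profile = {
    u: [fused_similarity(profiles[u], profiles[v]) for v in names]
    for u in names
}
```

In a pipeline like the one described, each row of `sim_profile` would serve as a node's initial feature vector for the graph auto-encoder. Concatenating the three similarity values instead of averaging them would be an equally plausible reading of "feature fusion".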
format Online
Article
Text
id pubmed-8710693
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8710693 2021-12-28 Front Genet Genetics Frontiers Media S.A.
2021-12-13 /pmc/articles/PMC8710693/ /pubmed/34966413 http://dx.doi.org/10.3389/fgene.2021.781277 Text en Copyright © 2021 Zhou, Sun, Zhang and Li. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Predicting Pseudogene–miRNA Associations Based on Feature Fusion and Graph Auto-Encoder
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8710693/
https://www.ncbi.nlm.nih.gov/pubmed/34966413
http://dx.doi.org/10.3389/fgene.2021.781277