Pseudo2GO: A Graph-Based Deep Learning Method for Pseudogene Function Prediction by Borrowing Information From Coding Genes

Pseudogenes, historically regarded as relics of evolution, have recently shown increasing functional potential. Computational methods for predicting pseudogene functions in terms of the Gene Ontology are important for directing experimental discovery. However, no pseudogene-specific computation...

Bibliographic Details
Main Authors: Fan, Kunjie; Zhang, Yan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7461887/
https://www.ncbi.nlm.nih.gov/pubmed/33014009
http://dx.doi.org/10.3389/fgene.2020.00807
_version_ 1783576812850249728
author Fan, Kunjie
Zhang, Yan
author_facet Fan, Kunjie
Zhang, Yan
author_sort Fan, Kunjie
collection PubMed
description Pseudogenes, historically regarded as relics of evolution, have recently shown increasing functional potential. Computational methods for predicting pseudogene functions in terms of the Gene Ontology (GO) are important for directing experimental discovery. However, no pseudogene-specific computational method has been proposed to predict GO terms directly. The biggest challenge for pseudogene function prediction is the lack of sufficient features and functional annotations, which makes training a predictive model difficult. Because pseudogenes are functionally similar to their parent coding genes, with which they share a large amount of DNA sequence, and because coding genes are richly annotated, we aim to predict pseudogene functions by borrowing information from coding genes in a graph-based way. Here we propose Pseudo2GO, a graph-based, semi-supervised deep learning model for pseudogene function prediction. A sequence similarity graph is first constructed to connect pseudogenes and coding genes. Multiple features are then incorporated as node attributes, making the graph an attributed graph; these include expression profiles, interactions with microRNAs, protein-protein interactions (PPIs), and genetic interactions. Graph convolutional networks propagate node attributes across the graph to classify pseudogenes. Comparing Pseudo2GO with other frameworks adapted from popular protein function prediction methods, we demonstrate that our method achieves state-of-the-art performance, significantly outperforming the other methods in terms of the M-AUPR metric.
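The propagation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the widely used graph convolution rule H' = act(D^-1/2 (A + I) D^-1/2 H W), and the toy graph, feature values, and fixed weights (`w1`, `w2`) are hypothetical values chosen only to show how an unannotated pseudogene node can inherit signal from a linked coding gene.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a symmetric adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own attributes.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(norm_adj, features, weights, activation):
    """One graph convolution: aggregate neighbours, transform, activate."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[activation(v) for v in row] for row in hidden]

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Hypothetical toy graph: nodes 0 and 1 are coding genes, node 2 is a
# pseudogene connected by sequence similarity to node 0 only.
adj = [[0, 0, 1],
       [0, 0, 0],
       [1, 0, 0]]
# Hypothetical node attributes; the pseudogene has no features of its own.
features = [[1.0, 0.0],
            [0.0, 1.0],
            [0.0, 0.0]]
# Fixed illustrative weights (a real model would learn these).
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[2.0], [-2.0]]

norm_adj = normalize_adjacency(adj)
hidden = gcn_layer(norm_adj, features, w1, relu)
scores = gcn_layer(norm_adj, hidden, w2, sigmoid)  # one score per node for one GO term
```

In this toy setting the pseudogene (node 2) ends up with the same score as the coding gene it is linked to (node 0), while the unconnected coding gene (node 1) scores differently. That is the "borrowing" effect in miniature. A trained model would learn the weights from the annotated nodes rather than fix them by hand.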
format Online
Article
Text
id pubmed-7461887
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7461887 2020-10-01 Pseudo2GO: A Graph-Based Deep Learning Method for Pseudogene Function Prediction by Borrowing Information From Coding Genes Fan, Kunjie Zhang, Yan Front Genet Genetics Frontiers Media S.A. 2020-08-18 /pmc/articles/PMC7461887/ /pubmed/33014009 http://dx.doi.org/10.3389/fgene.2020.00807 Text en Copyright © 2020 Fan and Zhang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Fan, Kunjie
Zhang, Yan
Pseudo2GO: A Graph-Based Deep Learning Method for Pseudogene Function Prediction by Borrowing Information From Coding Genes
title Pseudo2GO: A Graph-Based Deep Learning Method for Pseudogene Function Prediction by Borrowing Information From Coding Genes
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7461887/
https://www.ncbi.nlm.nih.gov/pubmed/33014009
http://dx.doi.org/10.3389/fgene.2020.00807