deepSimDEF: deep neural embeddings of gene products and gene ontology terms for functional analysis of genes

Bibliographic Details
Main authors: Pesaranghader, Ahmad; Matwin, Stan; Sokolova, Marina; Grenier, Jean-Christophe; Beiko, Robert G; Hussin, Julie
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9154256/
https://www.ncbi.nlm.nih.gov/pubmed/35536192
http://dx.doi.org/10.1093/bioinformatics/btac304
collection PubMed
description MOTIVATION: There is a plethora of measures to evaluate functional similarity (FS) of genes based on their co-expression, protein–protein interactions and sequence similarity. These measures are typically derived from hand-engineered and application-specific metrics to quantify the degree of shared information between two genes using their Gene Ontology (GO) annotations.

RESULTS: We introduce deepSimDEF, a deep learning method to automatically learn FS estimation of gene pairs given a set of genes and their GO annotations. deepSimDEF’s key novelty is its ability to learn low-dimensional embedding vector representations of GO terms and gene products and then calculate FS using these learned vectors. We show that deepSimDEF can predict the FS of new genes using their annotations: it outperformed all other FS measures by >5–10% on yeast and human reference datasets on protein–protein interactions, gene co-expression and sequence homology tasks. Thus, deepSimDEF offers a powerful and adaptable deep neural architecture that can benefit a wide range of problems in genomics and proteomics, and its architecture is flexible enough to support its extension to any organism.

AVAILABILITY AND IMPLEMENTATION: Source code and data are available at https://github.com/ahmadpgh/deepSimDEF

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
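
The abstract describes representing GO terms and gene products as low-dimensional vectors and computing FS from those vectors. The toy sketch below only illustrates that general idea and is not the published deepSimDEF architecture. Everything in it is an assumption made for illustration: the gene names and term labels are placeholders, random vectors stand in for learned embeddings, element-wise max pooling is one possible way to combine a gene's term vectors, and cosine similarity replaces whatever scoring the trained network performs.

```python
import math
import random


def make_term_embeddings(terms, dim=8, seed=0):
    """Random stand-ins for learned GO-term vectors (assumption: real ones are trained)."""
    rng = random.Random(seed)
    # Sort the terms so the assignment is reproducible across runs.
    return {t: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for t in sorted(terms)}


def gene_vector(annotations, term_vecs):
    """Combine a gene's term vectors by element-wise max (an assumed pooling choice)."""
    vecs = [term_vecs[t] for t in annotations]
    return [max(column) for column in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical annotations: gene names and term labels are placeholders.
annotations = {
    "geneA": {"TERM_1", "TERM_2"},
    "geneB": {"TERM_1", "TERM_2"},
    "geneC": {"TERM_3"},
}

term_vecs = make_term_embeddings(set().union(*annotations.values()))
gene_vecs = {g: gene_vector(a, term_vecs) for g, a in annotations.items()}


def functional_similarity(g1, g2):
    """Toy FS score: cosine similarity of the two pooled gene vectors."""
    return cosine(gene_vecs[g1], gene_vecs[g2])
```

Calling `functional_similarity("geneA", "geneB")` scores two genes with identical annotation sets, so their pooled vectors coincide. In a trained model the term vectors would be fitted against reference data such as protein–protein interactions, which this sketch does not attempt.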
id pubmed-9154256
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9154256 2022-06-04. Bioinformatics, Original Papers. Oxford University Press, 2022-05-10. /pmc/articles/PMC9154256/ /pubmed/35536192 http://dx.doi.org/10.1093/bioinformatics/btac304 Text en © The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Original Papers