
TopoFun: a machine learning method to improve the functional similarity of gene co-expression modules

A comprehensive, accurate functional annotation of genes is key to systems-level approaches. As functionally related genes tend to be co-expressed, one possible approach to identify functional modules or supplement existing gene annotations is to analyse gene co-expression. We describe TopoFun, a ma...


Bibliographic Details
Main authors: Janbain, Ali; Reynès, Christelle; Assaghir, Zainab; Zeineddine, Hassan; Sabatier, Robert; Journot, Laurent
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8573820/
https://www.ncbi.nlm.nih.gov/pubmed/34761220
http://dx.doi.org/10.1093/nargab/lqab103
_version_ 1784595495457390592
author Janbain, Ali
Reynès, Christelle
Assaghir, Zainab
Zeineddine, Hassan
Sabatier, Robert
Journot, Laurent
collection PubMed
description A comprehensive, accurate functional annotation of genes is key to systems-level approaches. As functionally related genes tend to be co-expressed, one possible approach to identify functional modules or supplement existing gene annotations is to analyse gene co-expression. We describe TopoFun, a machine learning method that combines topological and functional information to improve the functional similarity of gene co-expression modules. Using LASSO, we selected topological descriptors that discriminated modules made of functionally related genes and random modules. Using the selected topological descriptors, we performed linear discriminant analysis to construct a topological score that predicted the type of a module, random-like or functional-like. We combined the topological score with a functional similarity score in a fitness function that we used in a genetic algorithm to explore the co-expression network. To illustrate the use of TopoFun, we started from a subset of the Gene Ontology Biological Processes (GO-BPs) and showed that TopoFun efficiently retrieved genes that we omitted, and aggregated a number of novel genes to the initial GO-BP while improving module topology and functional similarity. Using an independent protein-protein interaction database, we confirmed that the novel genes gathered by TopoFun were functionally related to the original gene set.
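The search step described in the abstract (a fitness function that combines a topological score with a functional similarity score and is optimised by a genetic algorithm) can be illustrated with a minimal sketch. This is not the authors' implementation. Edge density stands in for the LDA-based topological score built from LASSO-selected descriptors, the mean pairwise Jaccard index of annotation sets stands in for the functional similarity score, and the equal weighting, all function names, the gene identifiers and the annotation terms are hypothetical.

```python
import random
from itertools import combinations

def density(module, edges):
    """Stand-in topological score: fraction of gene pairs that are co-expressed."""
    pairs = list(combinations(sorted(module), 2))
    if not pairs:
        return 0.0
    return sum(1 for pair in pairs if pair in edges) / len(pairs)

def functional_similarity(module, annotations):
    """Stand-in functional score: mean pairwise Jaccard index of annotation sets."""
    pairs = list(combinations(sorted(module), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = annotations[a] | annotations[b]
        if union:
            total += len(annotations[a] & annotations[b]) / len(union)
    return total / len(pairs)

def fitness(module, edges, annotations, weight=0.5):
    """Weighted sum of both scores; the 0.5 weight is an arbitrary assumption."""
    return (weight * density(module, edges)
            + (1.0 - weight) * functional_similarity(module, annotations))

def mutate(module, genes, rng):
    """Add or remove one random gene, never shrinking below two genes."""
    module = set(module)
    outside = sorted(set(genes) - module)
    can_remove = len(module) > 2
    if outside and (not can_remove or rng.random() < 0.5):
        module.add(rng.choice(outside))
    elif can_remove:
        module.remove(rng.choice(sorted(module)))
    return frozenset(module)

def crossover(a, b, rng):
    """Keep genes shared by both parents; keep each unshared gene with p = 0.5."""
    child = set(a & b)
    for gene in sorted(a ^ b):
        if rng.random() < 0.5:
            child.add(gene)
    return frozenset(child) if len(child) >= 2 else a

def evolve(seed, genes, edges, annotations, generations=40, pop_size=20, rng_seed=0):
    """Toy genetic algorithm: the better half survives and breeds the next half."""
    rng = random.Random(rng_seed)

    def score(module):
        return fitness(module, edges, annotations)

    seed = frozenset(seed)
    population = [seed] + [mutate(seed, genes, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # survivors, so the best score never drops
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), genes, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)

# Hypothetical toy network; each edge is stored as an alphabetically sorted pair.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
edges = {("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g4", "g5")}
annotations = {
    "g1": {"term_a"}, "g2": {"term_a"}, "g3": {"term_a", "term_b"},
    "g4": {"term_c"}, "g5": {"term_c"}, "g6": set(),
}
seed = frozenset({"g1", "g4"})
best = evolve(seed, genes, edges, annotations)
print(sorted(best), round(fitness(best, edges, annotations), 3))
```

Because the better half of each generation is carried over unchanged, the best fitness in the population cannot decrease, so the returned module always scores at least as well as the seed module. Module size is not penalised here, so this toy search may settle on very small modules; a practical fitness function would need to account for that.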
format Online
Article
Text
id pubmed-8573820
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8573820 2021-11-09 TopoFun: a machine learning method to improve the functional similarity of gene co-expression modules Janbain, Ali Reynès, Christelle Assaghir, Zainab Zeineddine, Hassan Sabatier, Robert Journot, Laurent NAR Genom Bioinform Methart Oxford University Press 2021-11-08 /pmc/articles/PMC8573820/ /pubmed/34761220 http://dx.doi.org/10.1093/nargab/lqab103 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title TopoFun: a machine learning method to improve the functional similarity of gene co-expression modules
topic Methart
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8573820/
https://www.ncbi.nlm.nih.gov/pubmed/34761220
http://dx.doi.org/10.1093/nargab/lqab103