
Benefits and Challenges of Pre-clustered Network-Based Pathway Analysis

Functional analysis of gene sets derived from experiments is typically done by pathway annotation. Although many algorithms exist for analyzing the association between a gene set and a pathway, an issue which is generally ignored is that gene sets often represent multiple pathways. In such cases an...


Bibliographic Details
Main authors: Castresana-Aguirre, Miguel; Guala, Dimitri; Sonnhammer, Erik L. L.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9127507/
https://www.ncbi.nlm.nih.gov/pubmed/35620466
http://dx.doi.org/10.3389/fgene.2022.855766
author Castresana-Aguirre, Miguel
Guala, Dimitri
Sonnhammer, Erik L. L.
collection PubMed
description Functional analysis of gene sets derived from experiments is typically done by pathway annotation. Although many algorithms exist for analyzing the association between a gene set and a pathway, an issue that is generally ignored is that gene sets often represent multiple pathways. In such cases an association to a pathway is weakened by the presence of genes associated with other pathways. A way to counteract this is to cluster the gene set into more homogeneous parts before performing pathway analysis on each module. We explored whether network-based pre-clustering of a query gene set can improve pathway analysis. The methods MCL, Infomap, and MGclus were used to cluster the gene set projected onto the FunCoup network. We characterized how well these methods are able to detect individual pathways in multi-pathway gene sets, and applied each of the clustering methods in combination with four pathway analysis methods: Gene Enrichment Analysis (GEA), BinoX, NEAT, and ANUBIX. Using benchmarks constructed from the KEGG pathway database, we found that clustering can be beneficial by increasing the sensitivity of pathway analysis methods and by providing deeper insights into the biological mechanisms related to the phenotype under study. However, keeping a high specificity is a challenge. For ANUBIX, clustering caused a minor loss of specificity, while for BinoX and NEAT it caused an unacceptable loss of specificity. GEA had very low sensitivity both before and after clustering. The choice of clustering method had only a minor effect on the results. We show examples of this approach and conclude that clustering can improve overall pathway annotation performance, but should only be used if the enrichment method used has a low false positive rate.
format Online
Article
Text
id pubmed-9127507
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9127507 2022-05-25 Benefits and Challenges of Pre-clustered Network-Based Pathway Analysis Castresana-Aguirre, Miguel Guala, Dimitri Sonnhammer, Erik L. L. Front Genet Genetics Frontiers Media S.A. 2022-05-10 /pmc/articles/PMC9127507/ /pubmed/35620466 http://dx.doi.org/10.3389/fgene.2022.855766 Text en
Copyright © 2022 Castresana-Aguirre, Guala and Sonnhammer. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Benefits and Challenges of Pre-clustered Network-Based Pathway Analysis
topic Genetics