Benefits and Challenges of Pre-clustered Network-Based Pathway Analysis
Functional analysis of gene sets derived from experiments is typically done by pathway annotation. Although many algorithms exist for analyzing the association between a gene set and a pathway, an issue which is generally ignored is that gene sets often represent multiple pathways. In such cases an...
Main Authors: | Castresana-Aguirre, Miguel; Guala, Dimitri; Sonnhammer, Erik L. L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9127507/ https://www.ncbi.nlm.nih.gov/pubmed/35620466 http://dx.doi.org/10.3389/fgene.2022.855766 |
_version_ | 1784712368236789760 |
---|---|
author | Castresana-Aguirre, Miguel; Guala, Dimitri; Sonnhammer, Erik L. L. |
author_sort | Castresana-Aguirre, Miguel |
collection | PubMed |
description | Functional analysis of gene sets derived from experiments is typically done by pathway annotation. Although many algorithms exist for analyzing the association between a gene set and a pathway, an issue which is generally ignored is that gene sets often represent multiple pathways. In such cases an association to a pathway is weakened by the presence of genes associated with other pathways. A way to counteract this is to cluster the gene set into more homogenous parts before performing pathway analysis on each module. We explored whether network-based pre-clustering of a query gene set can improve pathway analysis. The methods MCL, Infomap, and MGclus were used to cluster the gene set projected onto the FunCoup network. We characterized how well these methods are able to detect individual pathways in multi-pathway gene sets, and applied each of the clustering methods in combination with four pathway analysis methods: Gene Enrichment Analysis, BinoX, NEAT, and ANUBIX. Using benchmarks constructed from the KEGG pathway database we found that clustering can be beneficial by increasing the sensitivity of pathway analysis methods and by providing deeper insights of biological mechanisms related to the phenotype under study. However, keeping a high specificity is a challenge. For ANUBIX, clustering caused a minor loss of specificity, while for BinoX and NEAT it caused an unacceptable loss of specificity. GEA had very low sensitivity both before and after clustering. The choice of clustering method only had a minor effect on the results. We show examples of this approach and conclude that clustering can improve overall pathway annotation performance, but should only be used if the used enrichment method has a low false positive rate. |
format | Online Article Text |
id | pubmed-9127507 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9127507 2022-05-25 Benefits and Challenges of Pre-clustered Network-Based Pathway Analysis Castresana-Aguirre, Miguel; Guala, Dimitri; Sonnhammer, Erik L. L. Front Genet Genetics Frontiers Media S.A. 2022-05-10 /pmc/articles/PMC9127507/ /pubmed/35620466 http://dx.doi.org/10.3389/fgene.2022.855766 Text en Copyright © 2022 Castresana-Aguirre, Guala and Sonnhammer. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Benefits and Challenges of Pre-clustered Network-Based Pathway Analysis |
title_sort | benefits and challenges of pre-clustered network-based pathway analysis |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9127507/ https://www.ncbi.nlm.nih.gov/pubmed/35620466 http://dx.doi.org/10.3389/fgene.2022.855766 |