Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks

Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identi...

Bibliographic Details
Main authors: Rahmani, Bahareh, Zimmermann, Michael T., Grill, Diane E., Kennedy, Richard B., Oberg, Ann L., White, Bill C., Poland, Gregory A., McKinney, Brett A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2016
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4861003/
https://www.ncbi.nlm.nih.gov/pubmed/27242890
http://dx.doi.org/10.3389/fgene.2016.00080
author Rahmani, Bahareh
Zimmermann, Michael T.
Grill, Diane E.
Kennedy, Richard B.
Oberg, Ann L.
White, Bill C.
Poland, Gregory A.
McKinney, Brett A.
collection PubMed
description Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identify an optimal partition of the network where the individual clusters are neither too large, prohibiting interpretation, nor too small, precluding general inference. Newman Modularity is a spectral clustering algorithm that automatically finds the number of clusters, but for many biological networks the cluster sizes are suboptimal. In this work, we generalize Newman Modularity to incorporate information from indirect paths in RNA-Seq co-expression networks. We implement a merge-and-split algorithm that allows the user to constrain the range of cluster sizes: large enough to capture genes in relevant pathways, yet small enough to resolve distinct functions. We investigate the properties of our recursive indirect-pathways modularity (RIP-M) and compare it with other clustering methods using simulated co-expression networks and RNA-seq data from an influenza vaccine response study. RIP-M had higher cluster assignment accuracy than Newman Modularity for finding clusters in simulated co-expression networks for all scenarios, and RIP-M had comparable accuracy to Weighted Gene Correlation Network Analysis (WGCNA). RIP-M was more accurate than WGCNA for modest hard thresholds and comparable for high, while WGCNA was slightly more accurate for soft thresholds. In the vaccine study data, RIP-M and WGCNA enriched for a comparable number of immunologically relevant pathways.
format Online
Article
Text
id pubmed-4861003
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4861003 2016-05-30 Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks Rahmani, Bahareh Zimmermann, Michael T. Grill, Diane E. Kennedy, Richard B. Oberg, Ann L. White, Bill C. Poland, Gregory A. McKinney, Brett A. Front Genet Genetics
Frontiers Media S.A. 2016-05-09 /pmc/articles/PMC4861003/ /pubmed/27242890 http://dx.doi.org/10.3389/fgene.2016.00080 Text en Copyright © 2016 Rahmani, Zimmermann, Grill, Kennedy, Oberg, White, Poland and McKinney. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4861003/
https://www.ncbi.nlm.nih.gov/pubmed/27242890
http://dx.doi.org/10.3389/fgene.2016.00080