A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes
We develop a method to recover a gene network’s structure from co-expression data, measured in terms of normalized Pearson’s correlation coefficients between gene pairs. We treat these co-expression measurements as weights in the complete graph in which nodes correspond to genes. To decide which edg...
Main authors: | Bar, Haim; Bang, Seojin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7877669/ https://www.ncbi.nlm.nih.gov/pubmed/33571253 http://dx.doi.org/10.1371/journal.pone.0246945 |
_version_ | 1783650217616211968 |
---|---|
author | Bar, Haim Bang, Seojin |
author_facet | Bar, Haim Bang, Seojin |
author_sort | Bar, Haim |
collection | PubMed |
description | We develop a method to recover a gene network’s structure from co-expression data, measured in terms of normalized Pearson’s correlation coefficients between gene pairs. We treat these co-expression measurements as weights in the complete graph in which nodes correspond to genes. To decide which edges exist in the gene network, we fit a three-component mixture model such that the observed weights of ‘null edges’ follow a normal distribution with mean 0, and the weights of non-null edges follow a mixture of two lognormal distributions, one for positively correlated and one for negatively correlated pairs. We show that this so-called L₂N mixture model outperforms other methods in terms of power to detect edges, and that it allows control of the false discovery rate. Importantly, our method makes no assumptions about the true network structure. We demonstrate our method, which is implemented in an R package called edgefinder, using a large dataset consisting of expression values of 12,750 genes obtained from 1,616 women. We infer the gene network structure by cancer subtype and find insightful subtype characteristics. For example, we find thirteen pathways that are enriched in each of the cancer groups but not in the Normal group, with two of the pathways associated with autoimmune diseases and two others with graft rejection. We also find specific characteristics of different breast cancer subtypes. For example, the Luminal A network includes a single, highly connected cluster of genes, which is enriched in the human diseases category, and in the Her2 subtype network we find a distinct and highly interconnected cluster that is uniquely enriched in drug metabolism pathways. |
format | Online Article Text |
id | pubmed-7877669 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7877669 2021-02-19 A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes Bar, Haim Bang, Seojin PLoS One Research Article We develop a method to recover a gene network’s structure from co-expression data, measured in terms of normalized Pearson’s correlation coefficients between gene pairs. We treat these co-expression measurements as weights in the complete graph in which nodes correspond to genes. To decide which edges exist in the gene network, we fit a three-component mixture model such that the observed weights of ‘null edges’ follow a normal distribution with mean 0, and the weights of non-null edges follow a mixture of two lognormal distributions, one for positively correlated and one for negatively correlated pairs. We show that this so-called L₂N mixture model outperforms other methods in terms of power to detect edges, and that it allows control of the false discovery rate. Importantly, our method makes no assumptions about the true network structure. We demonstrate our method, which is implemented in an R package called edgefinder, using a large dataset consisting of expression values of 12,750 genes obtained from 1,616 women. We infer the gene network structure by cancer subtype and find insightful subtype characteristics. For example, we find thirteen pathways that are enriched in each of the cancer groups but not in the Normal group, with two of the pathways associated with autoimmune diseases and two others with graft rejection. We also find specific characteristics of different breast cancer subtypes. For example, the Luminal A network includes a single, highly connected cluster of genes, which is enriched in the human diseases category, and in the Her2 subtype network we find a distinct and highly interconnected cluster that is uniquely enriched in drug metabolism pathways. 
Public Library of Science 2021-02-11 /pmc/articles/PMC7877669/ /pubmed/33571253 http://dx.doi.org/10.1371/journal.pone.0246945 Text en © 2021 Bar, Bang http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Bar, Haim Bang, Seojin A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes |
title | A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes |
title_full | A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes |
title_fullStr | A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes |
title_full_unstemmed | A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes |
title_short | A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes |
title_sort | mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7877669/ https://www.ncbi.nlm.nih.gov/pubmed/33571253 http://dx.doi.org/10.1371/journal.pone.0246945 |
work_keys_str_mv | AT barhaim amixturemodeltodetectedgesinsparsecoexpressiongraphswithanapplicationforcomparingbreastcancersubtypes AT bangseojin amixturemodeltodetectedgesinsparsecoexpressiongraphswithanapplicationforcomparingbreastcancersubtypes AT barhaim mixturemodeltodetectedgesinsparsecoexpressiongraphswithanapplicationforcomparingbreastcancersubtypes AT bangseojin mixturemodeltodetectedgesinsparsecoexpressiongraphswithanapplicationforcomparingbreastcancersubtypes |
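
Illustrative note: the three-component mixture named in the description field (a mean-0 normal for null edges plus two lognormals for positively and negatively correlated pairs) can be sketched in a few lines. The following is a minimal Python sketch, not the authors' implementation and not the edgefinder API. All function names, the parameter values in `PARAMS`, and the simple posterior cutoff are hypothetical choices made for illustration; the record does not describe how the authors estimate parameters or control the false discovery rate.

```python
import math


def normal_pdf(w, sigma):
    """Density of a normal distribution with mean 0 and std. dev. sigma."""
    return math.exp(-0.5 * (w / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def lognormal_pdf(x, mu, s):
    """Density of a lognormal distribution; zero for non-positive x."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / s
    return math.exp(-0.5 * z * z) / (x * s * math.sqrt(2.0 * math.pi))


def posterior_null(w, p0, p_pos, p_neg, sigma, mu_pos, s_pos, mu_neg, s_neg):
    """Posterior probability that an edge with weight w is a null edge.

    Null weights follow N(0, sigma^2); positive non-null weights follow a
    lognormal; negative non-null weights follow a lognormal in -w.
    """
    null = p0 * normal_pdf(w, sigma)
    pos = p_pos * lognormal_pdf(w, mu_pos, s_pos)
    neg = p_neg * lognormal_pdf(-w, mu_neg, s_neg)
    total = null + pos + neg
    if total == 0.0:
        # All densities underflowed; conservatively treat the edge as null.
        return 1.0
    return null / total


# Hypothetical parameter values, chosen only to make the example concrete.
PARAMS = dict(p0=0.98, p_pos=0.01, p_neg=0.01, sigma=1.0,
              mu_pos=1.5, s_pos=0.3, mu_neg=1.5, s_neg=0.3)


def is_edge(w, cutoff=0.05, params=PARAMS):
    """Call an edge when the posterior null probability falls below cutoff.

    This fixed cutoff is a simplification, not the paper's FDR procedure.
    """
    return posterior_null(w, **params) < cutoff


if __name__ == "__main__":
    for weight in (-5.0, 0.5, 5.0):
        print(weight, round(posterior_null(weight, **PARAMS), 6), is_edge(weight))
```

Under these assumed parameters, weights near zero are attributed almost entirely to the null component, while large positive or negative weights are attributed to the corresponding lognormal component. In practice the mixture parameters would be estimated from the data rather than fixed by hand.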