
A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes

We develop a method to recover a gene network’s structure from co-expression data, measured in terms of normalized Pearson’s correlation coefficients between gene pairs. We treat these co-expression measurements as weights in the complete graph in which nodes correspond to genes. To decide which edg...

Bibliographic Details
Main authors: Bar, Haim; Bang, Seojin
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7877669/
https://www.ncbi.nlm.nih.gov/pubmed/33571253
http://dx.doi.org/10.1371/journal.pone.0246945
author Bar, Haim
Bang, Seojin
collection PubMed
description We develop a method to recover a gene network’s structure from co-expression data, measured in terms of normalized Pearson’s correlation coefficients between gene pairs. We treat these co-expression measurements as weights in the complete graph in which nodes correspond to genes. To decide which edges exist in the gene network, we fit a three-component mixture model such that the observed weights of ‘null edges’ follow a normal distribution with mean 0, and the non-null edges follow a mixture of two lognormal distributions, one for positively and one for negatively correlated pairs. We show that this so-called L2N mixture model outperforms other methods in terms of power to detect edges, and that it allows control of the false discovery rate. Importantly, our method makes no assumptions about the true network structure. We demonstrate our method, which is implemented in an R package called edgefinder, using a large dataset consisting of expression values of 12,750 genes obtained from 1,616 women. We infer the gene network structure by cancer subtype, and find insightful subtype characteristics. For example, we find thirteen pathways that are enriched in each of the cancer groups but not in the Normal group, with two of the pathways associated with autoimmune diseases and two others with graft rejection. We also find specific characteristics of different breast cancer subtypes. For example, the Luminal A network includes a single, highly connected cluster of genes, which is enriched in the human diseases category, and in the Her2 subtype network we find a distinct and highly interconnected cluster that is uniquely enriched in drug metabolism pathways.
format Online
Article
Text
id pubmed-7877669
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7877669 2021-02-19 A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes Bar, Haim Bang, Seojin PLoS One Research Article
Public Library of Science 2021-02-11 /pmc/articles/PMC7877669/ /pubmed/33571253 http://dx.doi.org/10.1371/journal.pone.0246945 Text en © 2021 Bar, Bang http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes
topic Research Article