Addressing noise in co-expression network construction

Gene co-expression networks (GCNs) provide multiple benefits to molecular research including hypothesis generation and biomarker discovery. Transcriptome profiles serve as input for GCN construction and are derived from increasingly larger studies with samples across multiple experimental conditions...

Bibliographic Details
Main Authors: Burns, Joshua J R; Shealy, Benjamin T; Greer, Mitchell S; Hadish, John A; McGowan, Matthew T; Biggs, Tyler; Smith, Melissa C; Feltus, F Alex; Ficklin, Stephen P
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8769892/
https://www.ncbi.nlm.nih.gov/pubmed/34850822
http://dx.doi.org/10.1093/bib/bbab495
_version_ 1784635245099745280
author Burns, Joshua J R
Shealy, Benjamin T
Greer, Mitchell S
Hadish, John A
McGowan, Matthew T
Biggs, Tyler
Smith, Melissa C
Feltus, F Alex
Ficklin, Stephen P
author_sort Burns, Joshua J R
collection PubMed
description Gene co-expression networks (GCNs) provide multiple benefits to molecular research including hypothesis generation and biomarker discovery. Transcriptome profiles serve as input for GCN construction and are derived from increasingly larger studies with samples across multiple experimental conditions, treatments, time points, genotypes, etc. Such experiments with larger numbers of variables confound discovery of true network edges, exclude edges and inhibit discovery of context (or condition) specific network edges. To demonstrate this problem, a 475-sample dataset is used to show that up to 97% of GCN edges can be misleading because correlations are false or incorrect. False and incorrect correlations can occur when tests are applied without ensuring assumptions are met, and pairwise gene expression may not meet test assumptions if the expression of at least one gene in the pairwise comparison is a function of multiple confounding variables. The ‘one-size-fits-all’ approach to GCN construction is therefore problematic for large, multivariable datasets. Recently, the Knowledge Independent Network Construction toolkit has been used in multiple studies to provide a dynamic approach to GCN construction that ensures statistical tests meet assumptions and confounding variables are addressed. Additionally, it can associate experimental context for each edge of the network resulting in context-specific GCNs (csGCNs). To help researchers recognize such challenges in GCN construction, and the creation of csGCNs, we provide a review of the workflow.
format Online
Article
Text
id pubmed-8769892
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8769892 2022-01-20 Addressing noise in co-expression network construction Burns, Joshua J R Shealy, Benjamin T Greer, Mitchell S Hadish, John A McGowan, Matthew T Biggs, Tyler Smith, Melissa C Feltus, F Alex Ficklin, Stephen P Brief Bioinform Problem Solving Protocol Oxford University Press 2021-11-30 /pmc/articles/PMC8769892/ /pubmed/34850822 http://dx.doi.org/10.1093/bib/bbab495 Text en © The Author(s) 2021. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Addressing noise in co-expression network construction
topic Problem Solving Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8769892/
https://www.ncbi.nlm.nih.gov/pubmed/34850822
http://dx.doi.org/10.1093/bib/bbab495