Improved biomarker discovery through a plot twist in transcriptomic data analysis
BACKGROUND: Transcriptomic analysis is crucial for understanding the functional elements of the genome, with the classic method consisting of screening transcriptomics datasets for differentially expressed genes (DEGs). Additionally, since 2005, weighted gene co-expression network analysis (WGCNA) h...
Main authors: | Sánchez-Baizán, Núria; Ribas, Laia; Piferrer, Francesc |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9509653/ https://www.ncbi.nlm.nih.gov/pubmed/36153614 http://dx.doi.org/10.1186/s12915-022-01398-w |
_version_ | 1784797275542781952 |
---|---|
author | Sánchez-Baizán, Núria Ribas, Laia Piferrer, Francesc |
author_facet | Sánchez-Baizán, Núria Ribas, Laia Piferrer, Francesc |
author_sort | Sánchez-Baizán, Núria |
collection | PubMed |
description | BACKGROUND: Transcriptomic analysis is crucial for understanding the functional elements of the genome, with the classic method consisting of screening transcriptomics datasets for differentially expressed genes (DEGs). Additionally, since 2005, weighted gene co-expression network analysis (WGCNA) has emerged as a powerful method to explore relationships between genes. However, an approach combining both methods, i.e., filtering the transcriptome dataset by DEGs or other criteria, followed by WGCNA (DEGs + WGCNA), has become common. This is of concern because such approach can affect the resulting underlying architecture of the network under analysis and lead to wrong conclusions. Here, we explore a plot twist to transcriptome data analysis: applying WGCNA to exploit entire datasets without affecting the topology of the network, followed with the strength and relative simplicity of DEG analysis (WGCNA + DEGs). We tested WGCNA + DEGs against DEGs + WGCNA to publicly available transcriptomics data in one of the most transcriptomically complex tissues and delicate processes: vertebrate gonads undergoing sex differentiation. We further validate the general applicability of our approach through analysis of datasets from three distinct model systems: European sea bass, mouse, and human. RESULTS: In all cases, WGCNA + DEGs clearly outperformed DEGs + WGCNA. First, the network model fit and node connectivity measures and other network statistics improved. The gene lists filtered by each method were different, the number of modules associated with the trait of interest and key genes retained increased, and GO terms of biological processes provided a more nuanced representation of the biological question under consideration. Lastly, WGCNA + DEGs facilitated biomarker discovery. 
CONCLUSIONS: We propose that building a co-expression network from an entire dataset, and only thereafter filtering by DEGs, should be the method to use in transcriptomic studies, regardless of biological system, species, or question being considered. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-022-01398-w. |
format | Online Article Text |
id | pubmed-9509653 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-95096532022-09-26 Improved biomarker discovery through a plot twist in transcriptomic data analysis Sánchez-Baizán, Núria Ribas, Laia Piferrer, Francesc BMC Biol Research Article BACKGROUND: Transcriptomic analysis is crucial for understanding the functional elements of the genome, with the classic method consisting of screening transcriptomics datasets for differentially expressed genes (DEGs). Additionally, since 2005, weighted gene co-expression network analysis (WGCNA) has emerged as a powerful method to explore relationships between genes. However, an approach combining both methods, i.e., filtering the transcriptome dataset by DEGs or other criteria, followed by WGCNA (DEGs + WGCNA), has become common. This is of concern because such approach can affect the resulting underlying architecture of the network under analysis and lead to wrong conclusions. Here, we explore a plot twist to transcriptome data analysis: applying WGCNA to exploit entire datasets without affecting the topology of the network, followed with the strength and relative simplicity of DEG analysis (WGCNA + DEGs). We tested WGCNA + DEGs against DEGs + WGCNA to publicly available transcriptomics data in one of the most transcriptomically complex tissues and delicate processes: vertebrate gonads undergoing sex differentiation. We further validate the general applicability of our approach through analysis of datasets from three distinct model systems: European sea bass, mouse, and human. RESULTS: In all cases, WGCNA + DEGs clearly outperformed DEGs + WGCNA. First, the network model fit and node connectivity measures and other network statistics improved. The gene lists filtered by each method were different, the number of modules associated with the trait of interest and key genes retained increased, and GO terms of biological processes provided a more nuanced representation of the biological question under consideration. Lastly, WGCNA + DEGs facilitated biomarker discovery. 
CONCLUSIONS: We propose that building a co-expression network from an entire dataset, and only thereafter filtering by DEGs, should be the method to use in transcriptomic studies, regardless of biological system, species, or question being considered. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-022-01398-w. BioMed Central 2022-09-24 /pmc/articles/PMC9509653/ /pubmed/36153614 http://dx.doi.org/10.1186/s12915-022-01398-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Sánchez-Baizán, Núria Ribas, Laia Piferrer, Francesc Improved biomarker discovery through a plot twist in transcriptomic data analysis |
title | Improved biomarker discovery through a plot twist in transcriptomic data analysis |
title_full | Improved biomarker discovery through a plot twist in transcriptomic data analysis |
title_fullStr | Improved biomarker discovery through a plot twist in transcriptomic data analysis |
title_full_unstemmed | Improved biomarker discovery through a plot twist in transcriptomic data analysis |
title_short | Improved biomarker discovery through a plot twist in transcriptomic data analysis |
title_sort | improved biomarker discovery through a plot twist in transcriptomic data analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9509653/ https://www.ncbi.nlm.nih.gov/pubmed/36153614 http://dx.doi.org/10.1186/s12915-022-01398-w |
work_keys_str_mv | AT sanchezbaizannuria improvedbiomarkerdiscoverythroughaplottwistintranscriptomicdataanalysis AT ribaslaia improvedbiomarkerdiscoverythroughaplottwistintranscriptomicdataanalysis AT piferrerfrancesc improvedbiomarkerdiscoverythroughaplottwistintranscriptomicdataanalysis |
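
Note on the workflow order named in the description field: the abstract contrasts filtering by DEGs before building the network (DEGs + WGCNA) with building the network from the entire dataset and filtering afterwards (WGCNA + DEGs). The toy sketch below only illustrates that difference in ordering. It is not the authors' pipeline and not the WGCNA R package: modules here are connected components of a soft-thresholded correlation graph, the DEG call is a plain difference of group means, and every function name, parameter value (`beta`, `cutoff`, `min_diff`), and data point is an assumption made up for illustration.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def coexpression_modules(expr, beta=6, cutoff=0.5):
    """Toy module detection: link genes whose unsigned soft-thresholded
    adjacency |r|**beta reaches `cutoff`, then take connected components.
    (Assumption: real WGCNA uses topological overlap and clustering instead.)"""
    genes = list(expr)
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if abs(pearson(expr[a], expr[b])) ** beta >= cutoff:
                parent[find(a)] = find(b)

    modules = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(sorted(m) for m in modules.values())


def differentially_expressed(expr, groups, min_diff=1.0):
    """Toy DEG call: absolute difference of group means >= min_diff.
    (Assumption: no statistical test, unlike real DEG tools.)"""
    degs = set()
    for gene, values in expr.items():
        a = [v for v, grp in zip(values, groups) if grp == "A"]
        b = [v for v, grp in zip(values, groups) if grp == "B"]
        if abs(mean(a) - mean(b)) >= min_diff:
            degs.add(gene)
    return degs


def degs_then_wgcna(expr, groups):
    """Filter first: the network only ever sees the DEGs."""
    degs = differentially_expressed(expr, groups)
    return coexpression_modules({g: v for g, v in expr.items() if g in degs})


def wgcna_then_degs(expr, groups):
    """Network first: modules come from the whole dataset, then each
    module is reduced to its DEG members (empty modules are dropped)."""
    degs = differentially_expressed(expr, groups)
    filtered = [[g for g in m if g in degs] for m in coexpression_modules(expr)]
    return [m for m in filtered if m]


# Invented example data: four genes, three samples per group.
groups = ["A", "A", "A", "B", "B", "B"]
expr = {
    "g1": [1, 2, 3, 6, 7, 8],
    "g2": [2, 4, 6, 12, 14, 16],
    "g3": [5, 6, 5, 6, 5, 6],
    "g4": [6, 5, 6, 5, 6, 5],
}
```

In the network-first order, `coexpression_modules(expr)` still sees `g3` and `g4` when the modules are formed, whereas in the filter-first order those genes are discarded before any correlation is computed. The example data are invented and say nothing about the results reported in the article.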