
Improved biomarker discovery through a plot twist in transcriptomic data analysis

Bibliographic Details
Main authors: Sánchez-Baizán, Núria; Ribas, Laia; Piferrer, Francesc
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9509653/
https://www.ncbi.nlm.nih.gov/pubmed/36153614
http://dx.doi.org/10.1186/s12915-022-01398-w
author Sánchez-Baizán, Núria
Ribas, Laia
Piferrer, Francesc
collection PubMed
description BACKGROUND: Transcriptomic analysis is crucial for understanding the functional elements of the genome, with the classic method consisting of screening transcriptomics datasets for differentially expressed genes (DEGs). Additionally, since 2005, weighted gene co-expression network analysis (WGCNA) has emerged as a powerful method to explore relationships between genes. However, an approach combining both methods, i.e., filtering the transcriptome dataset by DEGs or other criteria and then applying WGCNA (DEGs + WGCNA), has become common. This is of concern because such an approach can alter the underlying architecture of the network under analysis and lead to wrong conclusions. Here, we explore a plot twist in transcriptome data analysis: applying WGCNA first, to exploit entire datasets without affecting the topology of the network, and then DEG analysis, for its strength and relative simplicity (WGCNA + DEGs). We tested WGCNA + DEGs against DEGs + WGCNA on publicly available transcriptomics data from one of the most transcriptomically complex tissues and delicate processes: vertebrate gonads undergoing sex differentiation. We further validated the general applicability of our approach by analyzing datasets from three distinct model systems: European sea bass, mouse, and human.
RESULTS: In all cases, WGCNA + DEGs clearly outperformed DEGs + WGCNA. First, the network model fit, node connectivity measures, and other network statistics improved. The gene lists filtered by each method were different, the numbers of modules associated with the trait of interest and of key genes retained increased, and GO terms of biological processes provided a more nuanced representation of the biological question under consideration. Lastly, WGCNA + DEGs facilitated biomarker discovery.
CONCLUSIONS: We propose that building a co-expression network from an entire dataset, and only thereafter filtering by DEGs, should be the method to use in transcriptomic studies, regardless of biological system, species, or question being considered.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-022-01398-w.
format Online
Article
Text
id pubmed-9509653
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9509653 2022-09-26 Improved biomarker discovery through a plot twist in transcriptomic data analysis. Sánchez-Baizán, Núria; Ribas, Laia; Piferrer, Francesc. BMC Biol. Research Article. BioMed Central 2022-09-24 /pmc/articles/PMC9509653/ /pubmed/36153614 http://dx.doi.org/10.1186/s12915-022-01398-w Text en
© The Author(s) 2022. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Improved biomarker discovery through a plot twist in transcriptomic data analysis
topic Research Article