Cargando…

Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter

BACKGROUND: The biological relevance and accuracy of gene expression data depend on the adequacy of data normalization. This is both due to its role in resolving and accounting for technical variation and errors, and its defining role in shaping the viewpoint of biological interpretations. Still, th...

Descripción completa

Detalles Bibliográficos
Autores principales:	Khan, Yusuf, Hammarström, Daniel, Ellefsen, Stian, Ahmad, Rafi
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	BioMed Central 2022
Materias:	Research
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9206305/ https://www.ncbi.nlm.nih.gov/pubmed/35717158 http://dx.doi.org/10.1186/s12859-022-04791-y

_version_	1784729310414766080
author	Khan, Yusuf Hammarström, Daniel Ellefsen, Stian Ahmad, Rafi
author_facet	Khan, Yusuf Hammarström, Daniel Ellefsen, Stian Ahmad, Rafi
author_sort	Khan, Yusuf
collection	PubMed
description	BACKGROUND: The biological relevance and accuracy of gene expression data depend on the adequacy of data normalization. This is both due to its role in resolving and accounting for technical variation and errors, and its defining role in shaping the viewpoint of biological interpretations. Still, the choice of the normalization method is often not explicitly motivated although this choice may be particularly decisive for conclusions in studies involving pronounced cellular plasticity. In this study, we highlight the consequences of using three fundamentally different modes of normalization for interpreting RNA-seq data from human skeletal muscle undergoing exercise-training-induced growth. Briefly, 25 participants conducted 12 weeks of high-load resistance training. Muscle biopsy specimens were sampled from m. vastus lateralis before, after two weeks of training (week 2) and after the intervention (week 12), and were subsequently analyzed using RNA-seq. Transcript counts were modeled as (1) per-library-size, (2) per-total-RNA, and (3) per-sample-size (per-mg-tissue). RESULT: Initially, the three modes of transcript modeling led to the identification of three unique sets of stable genes, which displayed differential expression profiles. Specifically, genes showing stable expression across samples in the per-library-size dataset displayed training-associated increases in per-total-RNA and per-sample-size datasets. These gene sets were then used for normalization of the entire dataset, providing transcript abundance estimates corresponding to each of the three biological viewpoints (i.e., per-library-size, per-total-RNA, and per-sample-size). The different normalization modes led to different conclusions, measured as training-associated changes in transcript expression. Briefly, for 27% and 20% of the transcripts, training was associated with changes in expression in per-total-RNA and per-sample-size scenarios, but not in the per-library-size scenario. At week 2, this led to opposite conclusions for 4% of the transcripts between per-library-size and per-sample-size datasets (↑ vs. ↓, respectively). CONCLUSION: Scientists should be explicit with their choice of normalization strategies and should interpret the results of gene expression analyses with caution. This is particularly important for data sets involving a limited number of genes or involving growing or differentiating cellular models, where the risk of biased conclusions is pronounced.
format	Online Article Text
id	pubmed-9206305
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-92063052022-06-19 Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter Khan, Yusuf Hammarström, Daniel Ellefsen, Stian Ahmad, Rafi BMC Bioinformatics Research BACKGROUND: The biological relevance and accuracy of gene expression data depend on the adequacy of data normalization. This is both due to its role in resolving and accounting for technical variation and errors, and its defining role in shaping the viewpoint of biological interpretations. Still, the choice of the normalization method is often not explicitly motivated although this choice may be particularly decisive for conclusions in studies involving pronounced cellular plasticity. In this study, we highlight the consequences of using three fundamentally different modes of normalization for interpreting RNA-seq data from human skeletal muscle undergoing exercise-training-induced growth. Briefly, 25 participants conducted 12 weeks of high-load resistance training. Muscle biopsy specimens were sampled from m. vastus lateralis before, after two weeks of training (week 2) and after the intervention (week 12), and were subsequently analyzed using RNA-seq. Transcript counts were modeled as (1) per-library-size, (2) per-total-RNA, and (3) per-sample-size (per-mg-tissue). RESULT: Initially, the three modes of transcript modeling led to the identification of three unique sets of stable genes, which displayed differential expression profiles. Specifically, genes showing stable expression across samples in the per-library-size dataset displayed training-associated increases in per-total-RNA and per-sample-size datasets. These gene sets were then used for normalization of the entire dataset, providing transcript abundance estimates corresponding to each of the three biological viewpoints (i.e., per-library-size, per-total-RNA, and per-sample-size). The different normalization modes led to different conclusions, measured as training-associated changes in transcript expression. Briefly, for 27% and 20% of the transcripts, training was associated with changes in expression in per-total-RNA and per-sample-size scenarios, but not in the per-library-size scenario. At week 2, this led to opposite conclusions for 4% of the transcripts between per-library-size and per-sample-size datasets (↑ vs. ↓, respectively). CONCLUSION: Scientists should be explicit with their choice of normalization strategies and should interpret the results of gene expression analyses with caution. This is particularly important for data sets involving a limited number of genes or involving growing or differentiating cellular models, where the risk of biased conclusions is pronounced. BioMed Central 2022-06-18 /pmc/articles/PMC9206305/ /pubmed/35717158 http://dx.doi.org/10.1186/s12859-022-04791-y Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle	Research Khan, Yusuf Hammarström, Daniel Ellefsen, Stian Ahmad, Rafi Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter
title	Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter
title_full	Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter
title_fullStr	Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter
title_full_unstemmed	Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter
title_short	Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter
title_sort	normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9206305/ https://www.ncbi.nlm.nih.gov/pubmed/35717158 http://dx.doi.org/10.1186/s12859-022-04791-y
work_keys_str_mv	AT khanyusuf normalizationofgeneexpressiondatarevisitedthethreeviewpointsofthetranscriptomeinhumanskeletalmuscleundergoingloadinducedhypertrophyandwhytheymatter AT hammarstromdaniel normalizationofgeneexpressiondatarevisitedthethreeviewpointsofthetranscriptomeinhumanskeletalmuscleundergoingloadinducedhypertrophyandwhytheymatter AT ellefsenstian normalizationofgeneexpressiondatarevisitedthethreeviewpointsofthetranscriptomeinhumanskeletalmuscleundergoingloadinducedhypertrophyandwhytheymatter AT ahmadrafi normalizationofgeneexpressiondatarevisitedthethreeviewpointsofthetranscriptomeinhumanskeletalmuscleundergoingloadinducedhypertrophyandwhytheymatter

Normalization of gene expression data revisited: the three viewpoints of the transcriptome in human skeletal muscle undergoing load-induced hypertrophy and why they matter

Ejemplares similares