Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis
BACKGROUND: RNA sequencing allows the measuring of gene expression at a resolution unmet by expression arrays or RT-qPCR. It is however necessary to normalize sequencing data by library size, transcript size and composition, among other factors, before comparing expression levels. The use of interna...
Main authors: | dos Santos, Karen Cristine Gonçalves; Desgagné-Penix, Isabel; Germain, Hugo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6954607/ https://www.ncbi.nlm.nih.gov/pubmed/31924161 http://dx.doi.org/10.1186/s12864-019-6426-2 |
_version_ | 1783486830747844608 |
---|---|
author | dos Santos, Karen Cristine Gonçalves Desgagné-Penix, Isabel Germain, Hugo |
author_facet | dos Santos, Karen Cristine Gonçalves Desgagné-Penix, Isabel Germain, Hugo |
author_sort | dos Santos, Karen Cristine Gonçalves |
collection | PubMed |
description | BACKGROUND: RNA sequencing allows the measuring of gene expression at a resolution unmet by expression arrays or RT-qPCR. It is however necessary to normalize sequencing data by library size, transcript size and composition, among other factors, before comparing expression levels. The use of internal control genes or spike-ins is advocated in the literature for scaling read counts, but the methods for choosing reference genes are mostly targeted at RT-qPCR studies and require a set of pre-selected candidate controls or pre-selected target genes. RESULTS: Here, we report an R-based pipeline to select internal control genes based solely on read counts and gene sizes. This novel method first normalizes the read counts to Transcripts per Million (TPM) and then excludes weakly expressed genes using the DAFS script to calculate the cut-off. It then selects as references the genes with lowest TPM coefficient of variation. We used this method to pick custom reference genes for the differential expression analysis of three transcriptome sets from transgenic Arabidopsis plants expressing heterologous fungal effector proteins tagged with GFP (using GFP alone as the control). The custom reference genes showed lower coefficient of variation and fold change as well as a broader range of expression levels than commonly used reference genes. When analyzed with NormFinder, both typical and custom reference genes were considered suitable internal controls, but the custom selected genes were more stably expressed. geNorm produced a similar result in which most custom selected genes ranked higher (i.e. were more stably expressed) than commonly used reference genes. CONCLUSIONS: The proposed method is innovative, rapid and simple. Since it does not depend on genome annotation, it can be used with any organism, and does not require pre-selected reference candidates or target genes that are not always available. |
format | Online Article Text |
id | pubmed-6954607 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6954607 2020-01-14 Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis dos Santos, Karen Cristine Gonçalves Desgagné-Penix, Isabel Germain, Hugo BMC Genomics Methodology Article BACKGROUND: RNA sequencing allows the measuring of gene expression at a resolution unmet by expression arrays or RT-qPCR. It is however necessary to normalize sequencing data by library size, transcript size and composition, among other factors, before comparing expression levels. The use of internal control genes or spike-ins is advocated in the literature for scaling read counts, but the methods for choosing reference genes are mostly targeted at RT-qPCR studies and require a set of pre-selected candidate controls or pre-selected target genes. RESULTS: Here, we report an R-based pipeline to select internal control genes based solely on read counts and gene sizes. This novel method first normalizes the read counts to Transcripts per Million (TPM) and then excludes weakly expressed genes using the DAFS script to calculate the cut-off. It then selects as references the genes with lowest TPM coefficient of variation. We used this method to pick custom reference genes for the differential expression analysis of three transcriptome sets from transgenic Arabidopsis plants expressing heterologous fungal effector proteins tagged with GFP (using GFP alone as the control). The custom reference genes showed lower coefficient of variation and fold change as well as a broader range of expression levels than commonly used reference genes. When analyzed with NormFinder, both typical and custom reference genes were considered suitable internal controls, but the custom selected genes were more stably expressed. geNorm produced a similar result in which most custom selected genes ranked higher (i.e. were more stably expressed) than commonly used reference genes. CONCLUSIONS: The proposed method is innovative, rapid and simple. Since it does not depend on genome annotation, it can be used with any organism, and does not require pre-selected reference candidates or target genes that are not always available. BioMed Central 2020-01-10 /pmc/articles/PMC6954607/ /pubmed/31924161 http://dx.doi.org/10.1186/s12864-019-6426-2 Text en © The Author(s). 2020, corrected publication 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article dos Santos, Karen Cristine Gonçalves Desgagné-Penix, Isabel Germain, Hugo Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis |
title | Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis |
title_full | Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis |
title_fullStr | Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis |
title_full_unstemmed | Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis |
title_short | Custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis |
title_sort | custom selected reference genes outperform pre-defined reference genes in transcriptomic analysis |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6954607/ https://www.ncbi.nlm.nih.gov/pubmed/31924161 http://dx.doi.org/10.1186/s12864-019-6426-2 |
work_keys_str_mv | AT dossantoskarencristinegoncalves customselectedreferencegenesoutperformpredefinedreferencegenesintranscriptomicanalysis AT desgagnepenixisabel customselectedreferencegenesoutperformpredefinedreferencegenesintranscriptomicanalysis AT germainhugo customselectedreferencegenesoutperformpredefinedreferencegenesintranscriptomicanalysis |
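
Illustrative sketch (not part of the catalog record): the description field above outlines a three-step selection: normalize read counts to TPM, exclude weakly expressed genes, then keep the genes with the lowest TPM coefficient of variation. The Python below is a minimal sketch of that idea under stated assumptions, not the authors' R pipeline. In particular, the DAFS cut-off is replaced by a hypothetical fixed threshold (`min_tpm`), genes are dropped if they fall below it in any sample (also an assumption), and all function and parameter names are invented for illustration.

```python
from statistics import mean, stdev


def tpm(counts, lengths_kb):
    """Convert one sample's raw read counts to Transcripts per Million.

    counts and lengths_kb are parallel lists (gene lengths in kilobases).
    """
    rates = [c / l for c, l in zip(counts, lengths_kb)]  # reads per kilobase
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def select_reference_genes(count_table, lengths_kb, min_tpm=1.0, n_refs=2):
    """Rank genes by TPM coefficient of variation across samples.

    count_table: {gene: [raw count in each sample]}
    lengths_kb:  {gene: gene length in kilobases}
    min_tpm:     hypothetical fixed cut-off standing in for the DAFS-derived one
    Returns (list of the n_refs most stable genes, {gene: CV}).
    """
    genes = list(count_table)
    n_samples = len(next(iter(count_table.values())))

    # Step 1: TPM normalization, one sample (column) at a time.
    tpm_table = {g: [] for g in genes}
    for s in range(n_samples):
        sample = tpm([count_table[g][s] for g in genes],
                     [lengths_kb[g] for g in genes])
        for g, value in zip(genes, sample):
            tpm_table[g].append(value)

    # Step 2: drop weakly expressed genes (assumption: below cut-off in any sample).
    # Step 3: coefficient of variation = sample standard deviation / mean.
    cv = {}
    for g, values in tpm_table.items():
        if min(values) < min_tpm:
            continue
        cv[g] = stdev(values) / mean(values)

    ranked = sorted(cv, key=cv.get)  # lowest CV first
    return ranked[:n_refs], cv


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_counts = {"A": [100, 100, 100], "B": [100, 300, 50],
                  "C": [0, 1, 0], "D": [200, 200, 200]}
    toy_lengths = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 2.0}
    refs, cvs = select_reference_genes(toy_counts, toy_lengths)
    print("candidate reference genes:", refs)
```

Because TPM values depend on every other gene in the sample, a gene with constant raw counts can still show a non-zero coefficient of variation; ranking by CV after normalization is what allows stable genes to be compared across libraries of different size and composition.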