
Optimal Scaling of Digital Transcriptomes

Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to c...


Bibliographic Details
Main Authors:	Glusman, Gustavo; Caballero, Juan; Robinson, Max; Kutlu, Burak; Hood, Leroy
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2013
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3819321/ https://www.ncbi.nlm.nih.gov/pubmed/24223126 http://dx.doi.org/10.1371/journal.pone.0077885

author	Glusman, Gustavo; Caballero, Juan; Robinson, Max; Kutlu, Burak; Hood, Leroy
collection	PubMed
description	Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also defined four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers.
format	Online Article Text
id	pubmed-3819321
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-3819321 2013-11-12 Optimal Scaling of Digital Transcriptomes Glusman, Gustavo; Caballero, Juan; Robinson, Max; Kutlu, Burak; Hood, Leroy PLoS One Research Article
Public Library of Science 2013-11-06 /pmc/articles/PMC3819321/ /pubmed/24223126 http://dx.doi.org/10.1371/journal.pone.0077885 Text en © 2013 Glusman et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle	Research Article Glusman, Gustavo Caballero, Juan Robinson, Max Kutlu, Burak Hood, Leroy Optimal Scaling of Digital Transcriptomes
title	Optimal Scaling of Digital Transcriptomes
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3819321/ https://www.ncbi.nlm.nih.gov/pubmed/24223126 http://dx.doi.org/10.1371/journal.pone.0077885
work_keys_str_mv	AT glusmangustavo optimalscalingofdigitaltranscriptomes AT caballerojuan optimalscalingofdigitaltranscriptomes AT robinsonmax optimalscalingofdigitaltranscriptomes AT kutluburak optimalscalingofdigitaltranscriptomes AT hoodleroy optimalscalingofdigitaltranscriptomes
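
The description above names one common normalization (scaling each sample to a fixed total) and one evaluation metric (the number of "uniform" genes, i.e. genes whose normalized expression has a sufficiently low coefficient of variation). The following is a minimal Python sketch of those two ideas only, not the authors' implementation. The function names, the target total of 1,000,000, the CV cutoff of 0.25, and the use of the population standard deviation are all assumptions made for illustration; the record does not state the values used in the article.

```python
from statistics import mean, pstdev


def scale_to_fixed_total(counts, target=1_000_000.0):
    """Rescale every sample so that its counts sum to `target`.

    `counts` maps gene name -> list of raw counts, one entry per sample.
    The target value is an illustrative assumption.
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    # Total raw count observed in each sample (column sums).
    totals = [sum(counts[g][s] for g in genes) for s in range(n_samples)]
    return {
        g: [counts[g][s] * target / totals[s] for s in range(n_samples)]
        for g in genes
    }


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population SD is an assumption)."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else float("inf")


def count_uniform_genes(expression, cv_cutoff=0.25):
    """Count genes whose CV across samples is at or below `cv_cutoff`.

    The cutoff is hypothetical; the record only says "sufficiently low".
    """
    return sum(
        1
        for values in expression.values()
        if coefficient_of_variation(values) <= cv_cutoff
    )


# Toy data (invented): the second sample was sequenced more deeply.
raw_counts = {
    "geneA": [100, 200],
    "geneB": [50, 100],
    "geneC": [10, 200],
}
normalized = scale_to_fixed_total(raw_counts)
uniform_before = count_uniform_genes(raw_counts)
uniform_after = count_uniform_genes(normalized)
```

Comparing `uniform_before` with `uniform_after` shows how the metric can score a normalization: a scaling that removes depth effects should leave more genes with a low coefficient of variation. According to the description, fixed-total scaling performed poorly among the fifteen algorithms compared, so this sketch illustrates the metric rather than a recommended method.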
