The Impact of Normalization Methods on RNA-Seq Data Analysis
High-throughput sequencing technologies, such as the Illumina Hi-seq, are powerful new tools for investigating a wide range of biological and medical problems. Massive and complex data sets produced by the sequencers create a need for development of statistical and computational methods that can tac...
Main authors: | Zyprych-Walczak, J., Szabelska, A., Handschuh, L., Górczak, K., Klamecka, K., Figlerowicz, M., Siatkowski, I. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4484837/ https://www.ncbi.nlm.nih.gov/pubmed/26176014 http://dx.doi.org/10.1155/2015/621690 |
_version_ | 1782378717930061824 |
---|---|
author | Zyprych-Walczak, J. Szabelska, A. Handschuh, L. Górczak, K. Klamecka, K. Figlerowicz, M. Siatkowski, I. |
author_sort | Zyprych-Walczak, J. |
collection | PubMed |
description | High-throughput sequencing technologies, such as the Illumina Hi-seq, are powerful new tools for investigating a wide range of biological and medical problems. Massive and complex data sets produced by the sequencers create a need for development of statistical and computational methods that can tackle the analysis and management of data. The data normalization is one of the most crucial steps of data processing and this process must be carefully considered as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied for the selection of the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, sensitivity and specificity of the methods, and classification errors as well as generation of the diagnostic plots. Combining the above information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably. |
format | Online Article Text |
id | pubmed-4484837 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-4484837 2015-07-14 The Impact of Normalization Methods on RNA-Seq Data Analysis Zyprych-Walczak, J. Szabelska, A. Handschuh, L. Górczak, K. Klamecka, K. Figlerowicz, M. Siatkowski, I. Biomed Res Int Research Article Hindawi Publishing Corporation 2015 2015-06-15 /pmc/articles/PMC4484837/ /pubmed/26176014 http://dx.doi.org/10.1155/2015/621690 Text en Copyright © 2015 J. Zyprych-Walczak et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | The Impact of Normalization Methods on RNA-Seq Data Analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4484837/ https://www.ncbi.nlm.nih.gov/pubmed/26176014 http://dx.doi.org/10.1155/2015/621690 |