A Powerful Statistical Approach for Large-Scale Differential Transcription Analysis

Next-generation sequencing (NGS) is increasingly used for transcriptome-wide analysis of differential gene expression. NGS data are multidimensional count data, so most statistical methods developed for microarray data analysis are not applicable to transcriptomic data....

Bibliographic Details
Main Authors: Tan, Yuan-De; Chandler, Anita M.; Chaudhury, Arindam; Neilson, Joel R.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4404056/
https://www.ncbi.nlm.nih.gov/pubmed/25894390
http://dx.doi.org/10.1371/journal.pone.0123658
author Tan, Yuan-De
Chandler, Anita M.
Chaudhury, Arindam
Neilson, Joel R.
collection PubMed
description Next-generation sequencing (NGS) is increasingly used for transcriptome-wide analysis of differential gene expression. NGS data are multidimensional count data, so most statistical methods developed for microarray data analysis are not applicable to transcriptomic data. For this reason, a variety of new statistical methods based on transcript read counts have been proposed. However, because of high cost and limited biological resources, current NGS data are still generated from only a few replicate libraries, and some of the existing methods do not always perform well on such count data. Here we developed a powerful and robust statistical method based on the beta and binomial distributions. Our method (mBeta t-test) is specifically applicable to sequence count data from small samples. On both simulated and real transcriptomic data, the mBeta t-test significantly outperformed the existing top statistical methods chosen for comparison in all 12 given scenarios, and it performed with high efficiency and high stability. The differentially expressed genes found by our method in real transcriptomic data were validated by qPCR experiments. Our method shows high power in finding truly differential expression, conservative FDR estimation, and high stability on RNA sequence count data derived from small samples. It can also be extended to genome-wide detection of differential splicing events.
format Online
Article
Text
id pubmed-4404056
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4404056 2015-05-02 A Powerful Statistical Approach for Large-Scale Differential Transcription Analysis Tan, Yuan-De Chandler, Anita M. Chaudhury, Arindam Neilson, Joel R. PLoS One Research Article
Public Library of Science 2015-04-20 /pmc/articles/PMC4404056/ /pubmed/25894390 http://dx.doi.org/10.1371/journal.pone.0123658 Text en © 2015 Tan et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Powerful Statistical Approach for Large-Scale Differential Transcription Analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4404056/
https://www.ncbi.nlm.nih.gov/pubmed/25894390
http://dx.doi.org/10.1371/journal.pone.0123658