Goodness-of-Fit Tests and Model Diagnostics for Negative Binomial Regression of RNA Sequencing Data

This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for th...

Bibliographic Details
Main Authors: Mi, Gu; Di, Yanming; Schafer, Daniel W.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4365073/
https://www.ncbi.nlm.nih.gov/pubmed/25787144
http://dx.doi.org/10.1371/journal.pone.0119254
collection PubMed
description This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
format Online
Article
Text
id pubmed-4365073
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4365073 2015-03-23 Goodness-of-Fit Tests and Model Diagnostics for Negative Binomial Regression of RNA Sequencing Data Mi, Gu; Di, Yanming; Schafer, Daniel W. PLoS One Research Article Public Library of Science 2015-03-18 /pmc/articles/PMC4365073/ /pubmed/25787144 http://dx.doi.org/10.1371/journal.pone.0119254 Text en © 2015 Mi et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article