MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments

BACKGROUND: As the barriers to incorporating RNA sequencing (RNA-Seq) into biomedical studies continue to decrease, the complexity and size of RNA-Seq experiments are rapidly growing. Paired, longitudinal, and other correlated designs are becoming commonplace, and these studies offer immense potenti...

Bibliographic Details
Main Authors: Vestal, Brian E., Moore, Camille M., Wynn, Elizabeth, Saba, Laura, Fingerlin, Tasha, Kechris, Katerina
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7455910/
https://www.ncbi.nlm.nih.gov/pubmed/32859148
http://dx.doi.org/10.1186/s12859-020-03715-y
author Vestal, Brian E.
Moore, Camille M.
Wynn, Elizabeth
Saba, Laura
Fingerlin, Tasha
Kechris, Katerina
author_facet Vestal, Brian E.
Moore, Camille M.
Wynn, Elizabeth
Saba, Laura
Fingerlin, Tasha
Kechris, Katerina
author_sort Vestal, Brian E.
collection PubMed
description BACKGROUND: As the barriers to incorporating RNA sequencing (RNA-Seq) into biomedical studies continue to decrease, the complexity and size of RNA-Seq experiments are rapidly growing. Paired, longitudinal, and other correlated designs are becoming commonplace, and these studies offer immense potential for understanding how transcriptional changes within an individual over time differ depending on treatment or environmental conditions. While several methods have been proposed for dealing with repeated measures within RNA-Seq analyses, they are either restricted to handling only paired measurements, can only test for differences between two groups, and/or have issues with maintaining nominal false positive and false discovery rates. In this work, we propose a Bayesian hierarchical negative binomial generalized linear mixed model framework that can flexibly model RNA-Seq counts from studies with arbitrarily many repeated observations, can include covariates, and also maintains nominal false positive and false discovery rates in its posterior inference. RESULTS: In simulation studies, we showed that our proposed method (MCMSeq) best combines high statistical power (i.e. sensitivity or recall) with maintenance of nominal false positive and false discovery rates compared to the other available strategies, especially at the smaller sample sizes investigated. This behavior was then replicated in an application to real RNA-Seq data where MCMSeq was able to find previously reported genes associated with tuberculosis infection in a cohort with longitudinal measurements. CONCLUSIONS: Failing to account for repeated measurements when analyzing RNA-Seq experiments can result in significantly inflated false positive and false discovery rates. Of the methods we investigated, whether they modeled RNA-Seq counts directly or worked on transformed values, the Bayesian hierarchical model implemented in the mcmseq R package (available at https://github.com/stop-pre16/mcmseq) best combined sensitivity and nominal error rate control.
format Online
Article
Text
id pubmed-7455910
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-74559102020-08-31 MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments Vestal, Brian E. Moore, Camille M. Wynn, Elizabeth Saba, Laura Fingerlin, Tasha Kechris, Katerina BMC Bioinformatics Methodology Article BACKGROUND: As the barriers to incorporating RNA sequencing (RNA-Seq) into biomedical studies continue to decrease, the complexity and size of RNA-Seq experiments are rapidly growing. Paired, longitudinal, and other correlated designs are becoming commonplace, and these studies offer immense potential for understanding how transcriptional changes within an individual over time differ depending on treatment or environmental conditions. While several methods have been proposed for dealing with repeated measures within RNA-Seq analyses, they are either restricted to handling only paired measurements, can only test for differences between two groups, and/or have issues with maintaining nominal false positive and false discovery rates. In this work, we propose a Bayesian hierarchical negative binomial generalized linear mixed model framework that can flexibly model RNA-Seq counts from studies with arbitrarily many repeated observations, can include covariates, and also maintains nominal false positive and false discovery rates in its posterior inference. RESULTS: In simulation studies, we showed that our proposed method (MCMSeq) best combines high statistical power (i.e. sensitivity or recall) with maintenance of nominal false positive and false discovery rates compared to the other available strategies, especially at the smaller sample sizes investigated. This behavior was then replicated in an application to real RNA-Seq data where MCMSeq was able to find previously reported genes associated with tuberculosis infection in a cohort with longitudinal measurements.
CONCLUSIONS: Failing to account for repeated measurements when analyzing RNA-Seq experiments can result in significantly inflated false positive and false discovery rates. Of the methods we investigated, whether they modeled RNA-Seq counts directly or worked on transformed values, the Bayesian hierarchical model implemented in the mcmseq R package (available at https://github.com/stop-pre16/mcmseq) best combined sensitivity and nominal error rate control. BioMed Central 2020-08-28 /pmc/articles/PMC7455910/ /pubmed/32859148 http://dx.doi.org/10.1186/s12859-020-03715-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Vestal, Brian E.
Moore, Camille M.
Wynn, Elizabeth
Saba, Laura
Fingerlin, Tasha
Kechris, Katerina
MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments
title MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments
title_full MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments
title_fullStr MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments
title_full_unstemmed MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments
title_short MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments
title_sort mcmseq: bayesian hierarchical modeling of clustered and repeated measures rna sequencing experiments
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7455910/
https://www.ncbi.nlm.nih.gov/pubmed/32859148
http://dx.doi.org/10.1186/s12859-020-03715-y