A Poisson reduced-rank regression model for association mapping in sequencing data
BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technologies allow for the study of gene expression in individual cells. Often, it is of interest to understand how transcriptional activity is associated with cell-specific covariates, such as cell type, genotype, or measures of cell health. Tradit...
Main authors: | Fitzgerald, Tiana; Jones, Andrew; Engelhardt, Barbara E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9733401/ https://www.ncbi.nlm.nih.gov/pubmed/36482321 http://dx.doi.org/10.1186/s12859-022-05054-6 |
_version_ | 1784846367834767360 |
---|---|
author | Fitzgerald, Tiana; Jones, Andrew; Engelhardt, Barbara E. |
author_facet | Fitzgerald, Tiana; Jones, Andrew; Engelhardt, Barbara E. |
author_sort | Fitzgerald, Tiana |
collection | PubMed |
description | BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technologies allow for the study of gene expression in individual cells. Often, it is of interest to understand how transcriptional activity is associated with cell-specific covariates, such as cell type, genotype, or measures of cell health. Traditional approaches for this type of association mapping assume independence between the outcome variables (or genes), and perform a separate regression for each. However, these methods are computationally costly and ignore the substantial correlation structure of gene expression. Furthermore, count-based scRNA-seq data pose challenges for traditional models based on Gaussian assumptions. RESULTS: We aim to resolve these issues by developing a reduced-rank regression model that identifies low-dimensional linear associations between a large number of cell-specific covariates and high-dimensional gene expression readouts. Our probabilistic model uses a Poisson likelihood in order to account for the unique structure of scRNA-seq counts. We demonstrate the performance of our model using simulations, and we apply our model to a scRNA-seq dataset, a spatial gene expression dataset, and a bulk RNA-seq dataset to show its behavior in three distinct analyses. CONCLUSION: We show that our statistical modeling approach, which is based on reduced-rank regression, captures associations between gene expression and cell- and sample-specific covariates by leveraging low-dimensional representations of transcriptional states. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05054-6. |
format | Online Article Text |
id | pubmed-9733401 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9733401 2022-12-10 A Poisson reduced-rank regression model for association mapping in sequencing data Fitzgerald, Tiana; Jones, Andrew; Engelhardt, Barbara E. BMC Bioinformatics Research BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technologies allow for the study of gene expression in individual cells. Often, it is of interest to understand how transcriptional activity is associated with cell-specific covariates, such as cell type, genotype, or measures of cell health. Traditional approaches for this type of association mapping assume independence between the outcome variables (or genes), and perform a separate regression for each. However, these methods are computationally costly and ignore the substantial correlation structure of gene expression. Furthermore, count-based scRNA-seq data pose challenges for traditional models based on Gaussian assumptions. RESULTS: We aim to resolve these issues by developing a reduced-rank regression model that identifies low-dimensional linear associations between a large number of cell-specific covariates and high-dimensional gene expression readouts. Our probabilistic model uses a Poisson likelihood in order to account for the unique structure of scRNA-seq counts. We demonstrate the performance of our model using simulations, and we apply our model to a scRNA-seq dataset, a spatial gene expression dataset, and a bulk RNA-seq dataset to show its behavior in three distinct analyses. CONCLUSION: We show that our statistical modeling approach, which is based on reduced-rank regression, captures associations between gene expression and cell- and sample-specific covariates by leveraging low-dimensional representations of transcriptional states. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05054-6. 
BioMed Central 2022-12-08 /pmc/articles/PMC9733401/ /pubmed/36482321 http://dx.doi.org/10.1186/s12859-022-05054-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Fitzgerald, Tiana; Jones, Andrew; Engelhardt, Barbara E. A Poisson reduced-rank regression model for association mapping in sequencing data |
title | A Poisson reduced-rank regression model for association mapping in sequencing data |
title_full | A Poisson reduced-rank regression model for association mapping in sequencing data |
title_fullStr | A Poisson reduced-rank regression model for association mapping in sequencing data |
title_full_unstemmed | A Poisson reduced-rank regression model for association mapping in sequencing data |
title_short | A Poisson reduced-rank regression model for association mapping in sequencing data |
title_sort | poisson reduced-rank regression model for association mapping in sequencing data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9733401/ https://www.ncbi.nlm.nih.gov/pubmed/36482321 http://dx.doi.org/10.1186/s12859-022-05054-6 |
work_keys_str_mv | AT fitzgeraldtiana apoissonreducedrankregressionmodelforassociationmappinginsequencingdata AT jonesandrew apoissonreducedrankregressionmodelforassociationmappinginsequencingdata AT engelhardtbarbarae apoissonreducedrankregressionmodelforassociationmappinginsequencingdata AT fitzgeraldtiana poissonreducedrankregressionmodelforassociationmappinginsequencingdata AT jonesandrew poissonreducedrankregressionmodelforassociationmappinginsequencingdata AT engelhardtbarbarae poissonreducedrankregressionmodelforassociationmappinginsequencingdata |
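
Illustrative sketch (not part of the catalog record and not the authors' implementation): the description field above names two ingredients, a reduced-rank coefficient matrix and a Poisson likelihood for counts. The NumPy code below shows one generic way these can be combined, under assumptions of our own: a log link, so that the rate matrix is exp(X A B) with A of shape (covariates × rank) and B of shape (rank × genes), and plain gradient ascent with a simple step-size backoff. The function names `poisson_loglik` and `fit_poisson_rrr` and all parameter values are hypothetical; the published model may use a different parameterization and inference scheme.

```python
import numpy as np


def poisson_loglik(Y, X, A, B):
    """Poisson log-likelihood (up to a constant in Y) with log link and rank-r coefficients A @ B."""
    eta = X @ A @ B  # linear predictor, shape (cells, genes)
    return float(np.sum(Y * eta - np.exp(eta)))


def fit_poisson_rrr(Y, X, rank, n_iter=200, step=1e-2, seed=0):
    """Gradient ascent on the Poisson log-likelihood; a step is kept only if it improves the fit."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    q = Y.shape[1]
    # Small random start; an assumption of this sketch, not a recommended initializer.
    A = 0.1 * rng.standard_normal((p, rank))
    B = 0.1 * rng.standard_normal((rank, q))
    history = [poisson_loglik(Y, X, A, B)]
    for _ in range(n_iter):
        R = Y - np.exp(X @ A @ B)      # residual: counts minus fitted rates
        grad_A = X.T @ R @ B.T / n     # gradient with respect to A
        grad_B = (X @ A).T @ R / n     # gradient with respect to B
        A_new = A + step * grad_A
        B_new = B + step * grad_B
        ll_new = poisson_loglik(Y, X, A_new, B_new)
        if ll_new > history[-1]:
            A, B = A_new, B_new
            history.append(ll_new)
            step *= 1.2                # cautiously grow the step after a success
        else:
            step *= 0.5                # back off and retry from the same point
    return A, B, history


# Simulated example: 200 cells, 5 covariates, 20 genes, true rank 2.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 5))
A_true = 0.3 * rng.standard_normal((5, 2))
B_true = 0.3 * rng.standard_normal((2, 20))
Y = rng.poisson(np.exp(X @ A_true @ B_true)).astype(float)

A_hat, B_hat, history = fit_poisson_rrr(Y, X, rank=2)
print("accepted steps:", len(history) - 1)
print("log-likelihood improved:", history[-1] > history[0])
```

Because the product A B has rank at most `rank`, all genes share the same few latent covariate combinations X A, which is the low-dimensional association structure the description refers to. Only the product is identifiable in this sketch; A and B individually are determined up to an invertible transformation.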