
A Poisson reduced-rank regression model for association mapping in sequencing data

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technologies allow for the study of gene expression in individual cells. Often, it is of interest to understand how transcriptional activity is associated with cell-specific covariates, such as cell type, genotype, or measures of cell health. Tradit...


Bibliographic Details
Main authors: Fitzgerald, Tiana; Jones, Andrew; Engelhardt, Barbara E.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9733401/
https://www.ncbi.nlm.nih.gov/pubmed/36482321
http://dx.doi.org/10.1186/s12859-022-05054-6
author Fitzgerald, Tiana
Jones, Andrew
Engelhardt, Barbara E.
collection PubMed
description BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technologies allow for the study of gene expression in individual cells. Often, it is of interest to understand how transcriptional activity is associated with cell-specific covariates, such as cell type, genotype, or measures of cell health. Traditional approaches for this type of association mapping assume independence between the outcome variables (or genes), and perform a separate regression for each. However, these methods are computationally costly and ignore the substantial correlation structure of gene expression. Furthermore, count-based scRNA-seq data pose challenges for traditional models based on Gaussian assumptions.
RESULTS: We aim to resolve these issues by developing a reduced-rank regression model that identifies low-dimensional linear associations between a large number of cell-specific covariates and high-dimensional gene expression readouts. Our probabilistic model uses a Poisson likelihood in order to account for the unique structure of scRNA-seq counts. We demonstrate the performance of our model using simulations, and we apply our model to a scRNA-seq dataset, a spatial gene expression dataset, and a bulk RNA-seq dataset to show its behavior in three distinct analyses.
CONCLUSION: We show that our statistical modeling approach, which is based on reduced-rank regression, captures associations between gene expression and cell- and sample-specific covariates by leveraging low-dimensional representations of transcriptional states.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05054-6.
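The idea summarized in the description, a low-rank coefficient matrix combined with a Poisson likelihood for counts, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a log link, a coefficient matrix factored as A @ B, simulated data, and plain gradient ascent with step halving. The function names (poisson_loglik, fit_poisson_rrr) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def poisson_loglik(X, Y, A, B):
    """Poisson log-likelihood (dropping the log(Y!) constant) under a log link.

    The linear predictor is X @ A @ B, so the p x q coefficient matrix A @ B
    has rank at most A.shape[1].
    """
    with np.errstate(over="ignore", invalid="ignore"):
        eta = X @ A @ B
        return float(np.sum(Y * eta - np.exp(eta)))


def fit_poisson_rrr(X, Y, rank, n_iter=200, step=1.0, seed=0):
    """Gradient ascent on A (p x rank) and B (rank x q) with step halving.

    Illustrative only: a step is accepted only if it increases the
    log-likelihood, so the returned history is non-decreasing.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    q = Y.shape[1]
    A = 0.1 * rng.standard_normal((p, rank))
    B = 0.1 * rng.standard_normal((rank, q))
    ll = poisson_loglik(X, Y, A, B)
    history = [ll]
    for _ in range(n_iter):
        resid = Y - np.exp(X @ A @ B)      # derivative of loglik w.r.t. eta
        G = X.T @ resid / n                # scaled gradient w.r.t. A @ B
        grad_A, grad_B = G @ B.T, A.T @ G  # chain rule through the factors
        while step > 1e-12:
            A_new, B_new = A + step * grad_A, B + step * grad_B
            ll_new = poisson_loglik(X, Y, A_new, B_new)
            if ll_new > ll:                # accept and try a larger step next
                A, B, ll = A_new, B_new, ll_new
                step *= 2.0
                break
            step *= 0.5                    # reject and shrink the step
        history.append(ll)
    return A, B, history


# Simulated example: 200 cells, 5 covariates, 30 genes, true rank 2.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
A_true = 0.5 * rng.standard_normal((5, 2))
B_true = 0.5 * rng.standard_normal((2, 30))
Y = rng.poisson(np.exp(X @ A_true @ B_true))

A_hat, B_hat, history = fit_poisson_rrr(X, Y, rank=2)
print("log-likelihood: start", history[0], "end", history[-1])
```

In this sketch the rows of B_hat act as a small set of gene-expression programs and X @ A_hat gives each cell's low-dimensional score on them. Fitting one shared factorization, rather than a separate regression per gene, is what lets correlated genes share information.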
format Online
Article
Text
id pubmed-9733401
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9733401 2022-12-10 A Poisson reduced-rank regression model for association mapping in sequencing data. Fitzgerald, Tiana; Jones, Andrew; Engelhardt, Barbara E. BMC Bioinformatics. Research.
BioMed Central 2022-12-08 /pmc/articles/PMC9733401/ /pubmed/36482321 http://dx.doi.org/10.1186/s12859-022-05054-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A Poisson reduced-rank regression model for association mapping in sequencing data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9733401/
https://www.ncbi.nlm.nih.gov/pubmed/36482321
http://dx.doi.org/10.1186/s12859-022-05054-6