Differential gene expression analysis based on linear mixed model corrects false positive inflation for studying quantitative traits

Differential gene expression (DGE) analysis has been widely employed to identify genes expressed differentially with respect to a trait of interest using RNA sequencing (RNA-Seq) data. Recent RNA-Seq data with large samples pose challenges to existing DGE methods, which were mainly developed for dic...

Bibliographic Details
Main authors: Tang, Shizhen; Buchman, Aron S.; Wang, Yanling; Avey, Denis; Xu, Jishu; Tasaki, Shinya; Bennett, David A.; Zheng, Qi; Yang, Jingjing
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK, 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547771/
https://www.ncbi.nlm.nih.gov/pubmed/37789141
http://dx.doi.org/10.1038/s41598-023-43686-7
author Tang, Shizhen
Buchman, Aron S.
Wang, Yanling
Avey, Denis
Xu, Jishu
Tasaki, Shinya
Bennett, David A.
Zheng, Qi
Yang, Jingjing
author_sort Tang, Shizhen
collection PubMed
description Differential gene expression (DGE) analysis has been widely employed to identify genes expressed differentially with respect to a trait of interest using RNA sequencing (RNA-Seq) data. Recent RNA-Seq data with large samples pose challenges to existing DGE methods, which were mainly developed for dichotomous traits and small sample sizes. In particular, existing DGE methods are likely to produce inflated false positive rates. To address this gap, we employed a linear mixed model (LMM), which has been widely used in genetic association studies, for DGE analysis of quantitative traits. We first applied the LMM method to the discovery RNA-Seq data of dorsolateral prefrontal cortex (DLPFC) tissue (n = 632) with four continuous measures of Alzheimer’s Disease (AD) cognitive and neuropathologic traits. The quantile–quantile plots of p-values showed that false positive rates were well calibrated by LMM, whereas other methods not accounting for sample-specific mixed effects led to serious inflation. LMM identified 37 potentially significant genes with differential expression in DLPFC for at least one of the AD traits, 17 of which were replicated in the additional RNA-Seq data of DLPFC, supplemental motor area, spinal cord, and muscle tissues. This application study showed not only well-calibrated DGE results by LMM, but also possibly shared gene regulatory mechanisms of AD traits across different relevant tissues.
format Online
Article
Text
id pubmed-10547771
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10547771 2023-10-05 Sci Rep Article
Nature Publishing Group UK 2023-10-03 /pmc/articles/PMC10547771/ /pubmed/37789141 http://dx.doi.org/10.1038/s41598-023-43686-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Differential gene expression analysis based on linear mixed model corrects false positive inflation for studying quantitative traits
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547771/
https://www.ncbi.nlm.nih.gov/pubmed/37789141
http://dx.doi.org/10.1038/s41598-023-43686-7