
Data-driven detection of subtype-specific differentially expressed genes

Among multiple subtypes of tissue or cell, subtype-specific differentially-expressed genes (SDEGs) are defined as being most-upregulated in only one subtype but not in any other. Detecting SDEGs plays a critical role in the molecular characterization and deconvolution of multicellular complex tissue...
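The definition above (a gene that is most-upregulated in exactly one subtype and in no other) can be illustrated with a minimal sketch of a one-versus-everyone comparison. This is not the authors' implementation: the function names, the toy data, and the log2 threshold of 1.0 are all hypothetical, and the sketch assumes strictly positive expression values on a linear scale. It compares the top subtype's mean against the runner-up's mean only, and it performs no significance testing.

```python
from math import log2
from statistics import mean


def ove_log_fold_change(expr_by_subtype):
    """Return (top_subtype, log2 fold change of top mean over runner-up mean).

    expr_by_subtype maps a subtype name to a list of positive expression
    values for a single gene (hypothetical input layout).
    """
    means = {subtype: mean(values) for subtype, values in expr_by_subtype.items()}
    ranked = sorted(means, key=means.get, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    # If the top subtype beats the runner-up, it beats every other subtype too.
    return top, log2(means[top] / means[runner_up])


def candidate_sdegs(genes, min_log2_fc=1.0):
    """Keep genes whose top subtype exceeds every other subtype by the threshold.

    min_log2_fc is an arbitrary illustrative cutoff, not a value from the paper.
    """
    hits = {}
    for gene, groups in genes.items():
        subtype, lfc = ove_log_fold_change(groups)
        if lfc >= min_log2_fc:
            hits[gene] = (subtype, lfc)
    return hits


# Toy data: geneA is high in one subtype only; geneB is high in two subtypes.
toy = {
    "geneA": {"B cell": [80.0, 82.0, 78.0], "T cell": [10.0, 9.0, 11.0], "monocyte": [8.0, 8.0, 8.0]},
    "geneB": {"B cell": [40.0, 42.0, 38.0], "T cell": [39.0, 41.0, 40.0], "monocyte": [5.0, 5.0, 5.0]},
}
hits = candidate_sdegs(toy)
print(hits)
```

In this toy input, geneB is strongly expressed in two subtypes, so a one-versus-rest comparison that pools the other subtypes could still flag it, whereas the one-versus-everyone comparison rejects it because the runner-up subtype is just as high.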


Bibliographic Details
Main authors: Chen, Lulu, Lu, Yingzhou, Wu, Chiung-Ting, Clarke, Robert, Yu, Guoqiang, Van Eyk, Jennifer E., Herrington, David M., Wang, Yue
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7801594/
https://www.ncbi.nlm.nih.gov/pubmed/33432005
http://dx.doi.org/10.1038/s41598-020-79704-1
author Chen, Lulu
Lu, Yingzhou
Wu, Chiung-Ting
Clarke, Robert
Yu, Guoqiang
Van Eyk, Jennifer E.
Herrington, David M.
Wang, Yue
collection PubMed
description Among multiple subtypes of tissue or cell, subtype-specific differentially-expressed genes (SDEGs) are defined as being most-upregulated in only one subtype but not in any other. Detecting SDEGs plays a critical role in the molecular characterization and deconvolution of multicellular complex tissues. Classic differential analysis assumes a null hypothesis whose test statistic is not subtype-specific, and thus can produce a high false-positive rate and/or lower detection power. Here we first introduce a One-Versus-Everyone Fold Change (OVE-FC) test for detecting SDEGs. We then propose a scaled test statistic (OVE-sFC) for assessing the statistical significance of SDEGs that applies a mixture null distribution model and a tailored permutation test. The OVE-FC/sFC test was validated on both type I error rate and detection power using extensive simulation data sets generated from real gene expression profiles of purified subtype samples. The OVE-FC/sFC test was then applied to two benchmark gene expression data sets of purified subtype samples and detected many known or previously unknown SDEGs. Subsequent supervised deconvolution results on synthesized bulk expression data, obtained using the SDEGs detected from the independent purified expression data by the OVE-FC/sFC test, showed superior performance in deconvolution accuracy when compared with popular peer methods.
format Online
Article
Text
id pubmed-7801594
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7801594 2021-01-12 Data-driven detection of subtype-specific differentially expressed genes Chen, Lulu Lu, Yingzhou Wu, Chiung-Ting Clarke, Robert Yu, Guoqiang Van Eyk, Jennifer E. Herrington, David M. Wang, Yue Sci Rep Article
Nature Publishing Group UK 2021-01-11 /pmc/articles/PMC7801594/ /pubmed/33432005 http://dx.doi.org/10.1038/s41598-020-79704-1 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Data-driven detection of subtype-specific differentially expressed genes
topic Article