Hypoxia classifier for transcriptome datasets

Bibliographic Details
Main Authors: Puente-Santamaría, Laura, Sanchez-Gonzalez, Lucia, Ramos-Ruiz, Ricardo, del Peso, Luis
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9153107/
https://www.ncbi.nlm.nih.gov/pubmed/35641902
http://dx.doi.org/10.1186/s12859-022-04741-8
_version_ 1784717780929478656
author Puente-Santamaría, Laura
Sanchez-Gonzalez, Lucia
Ramos-Ruiz, Ricardo
del Peso, Luis
author_facet Puente-Santamaría, Laura
Sanchez-Gonzalez, Lucia
Ramos-Ruiz, Ricardo
del Peso, Luis
author_sort Puente-Santamaría, Laura
collection PubMed
description Molecular gene signatures are useful tools to characterize the physiological state of cell populations, but most have been developed under a narrow range of conditions and cell types and are often restricted to a set of gene identities. Focusing on the transcriptional response to hypoxia, we aimed to generate widely applicable classifiers sourced from the results of a meta-analysis of 69 differential expression datasets, which included 425 individual RNA-seq experiments from 33 different human cell types exposed to different degrees of hypoxia (0.1–5% O₂) for 2–48 h. The resulting decision trees include both gene identities and quantitative boundaries, allowing for easy classification of individual samples without a control or normoxic reference. Each tree is composed of 3–5 genes mostly drawn from a small set of just 8 genes (EGLN1, MIR210HG, NDRG1, ANKRD37, TCAF2, PFKFB3, BHLHE40, and MAFF). In spite of their simplicity, these classifiers achieve over 95% accuracy in cross-validation and over 80% accuracy when applied to additional challenging datasets. Our results indicate that the classifiers are able to identify hypoxic tumor samples from bulk RNA-seq and hypoxic regions within tumors from spatially resolved transcriptomics datasets. Moreover, application of the classifiers to histological sections from normal tissues suggests the presence of a hypoxic gene expression pattern in the kidney cortex not observed in other normoxic organs. Finally, the tree classifiers described herein outperform traditional hypoxic gene signatures when compared against a wide range of datasets. This work describes a set of hypoxic gene signatures, structured as simple decision trees, that identify hypoxic samples and regions with high accuracy and can be applied to a broad variety of gene expression datasets and formats. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04741-8.
format Online
Article
Text
id pubmed-9153107
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9153107 2022-06-01 Hypoxia classifier for transcriptome datasets Puente-Santamaría, Laura Sanchez-Gonzalez, Lucia Ramos-Ruiz, Ricardo del Peso, Luis BMC Bioinformatics Research BioMed Central 2022-05-31 /pmc/articles/PMC9153107/ /pubmed/35641902 http://dx.doi.org/10.1186/s12859-022-04741-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Puente-Santamaría, Laura
Sanchez-Gonzalez, Lucia
Ramos-Ruiz, Ricardo
del Peso, Luis
Hypoxia classifier for transcriptome datasets
title Hypoxia classifier for transcriptome datasets
title_full Hypoxia classifier for transcriptome datasets
title_fullStr Hypoxia classifier for transcriptome datasets
title_full_unstemmed Hypoxia classifier for transcriptome datasets
title_short Hypoxia classifier for transcriptome datasets
title_sort hypoxia classifier for transcriptome datasets
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9153107/
https://www.ncbi.nlm.nih.gov/pubmed/35641902
http://dx.doi.org/10.1186/s12859-022-04741-8
work_keys_str_mv AT puentesantamarialaura hypoxiaclassifierfortranscriptomedatasets
AT sanchezgonzalezlucia hypoxiaclassifierfortranscriptomedatasets
AT ramosruizricardo hypoxiaclassifierfortranscriptomedatasets
AT delpesoluis hypoxiaclassifierfortranscriptomedatasets