Hypoxia classifier for transcriptome datasets
Molecular gene signatures are useful tools to characterize the physiological state of cell populations, but most have been developed under a narrow range of conditions and cell types and are often restricted to a set of gene identities. Focusing on the transcriptional response to hypoxia, we aimed to gen...
Main authors: | Puente-Santamaría, Laura; Sanchez-Gonzalez, Lucia; Ramos-Ruiz, Ricardo; del Peso, Luis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9153107/ https://www.ncbi.nlm.nih.gov/pubmed/35641902 http://dx.doi.org/10.1186/s12859-022-04741-8 |
_version_ | 1784717780929478656 |
---|---|
author | Puente-Santamaría, Laura; Sanchez-Gonzalez, Lucia; Ramos-Ruiz, Ricardo; del Peso, Luis |
author_facet | Puente-Santamaría, Laura; Sanchez-Gonzalez, Lucia; Ramos-Ruiz, Ricardo; del Peso, Luis |
author_sort | Puente-Santamaría, Laura |
collection | PubMed |
description | Molecular gene signatures are useful tools to characterize the physiological state of cell populations, but most have been developed under a narrow range of conditions and cell types and are often restricted to a set of gene identities. Focusing on the transcriptional response to hypoxia, we aimed to generate widely applicable classifiers sourced from the results of a meta-analysis of 69 differential expression datasets, which included 425 individual RNA-seq experiments from 33 different human cell types exposed to different degrees of hypoxia (0.1–5% O₂) for 2–48 h. The resulting decision trees include both gene identities and quantitative boundaries, allowing for easy classification of individual samples without a control or normoxic reference. Each tree is composed of 3–5 genes mostly drawn from a small set of just 8 genes (EGLN1, MIR210HG, NDRG1, ANKRD37, TCAF2, PFKFB3, BHLHE40, and MAFF). In spite of their simplicity, these classifiers achieve over 95% accuracy in cross-validation and over 80% accuracy when applied to additional challenging datasets. Our results indicate that the classifiers are able to identify hypoxic tumor samples from bulk RNA-seq and hypoxic regions within tumors from spatially resolved transcriptomics datasets. Moreover, application of the classifiers to histological sections from normal tissues suggests the presence of a hypoxic gene expression pattern in the kidney cortex not observed in other normoxic organs. Finally, the tree classifiers described herein outperform traditional hypoxic gene signatures when compared against a wide range of datasets. This work describes a set of hypoxic gene signatures, structured as simple decision trees, that identify hypoxic samples and regions with high accuracy and can be applied to a broad variety of gene expression datasets and formats. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04741-8. |
format | Online Article Text |
id | pubmed-9153107 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9153107 2022-06-01 Hypoxia classifier for transcriptome datasets Puente-Santamaría, Laura; Sanchez-Gonzalez, Lucia; Ramos-Ruiz, Ricardo; del Peso, Luis BMC Bioinformatics Research Molecular gene signatures are useful tools to characterize the physiological state of cell populations, but most have been developed under a narrow range of conditions and cell types and are often restricted to a set of gene identities. Focusing on the transcriptional response to hypoxia, we aimed to generate widely applicable classifiers sourced from the results of a meta-analysis of 69 differential expression datasets, which included 425 individual RNA-seq experiments from 33 different human cell types exposed to different degrees of hypoxia (0.1–5% O₂) for 2–48 h. The resulting decision trees include both gene identities and quantitative boundaries, allowing for easy classification of individual samples without a control or normoxic reference. Each tree is composed of 3–5 genes mostly drawn from a small set of just 8 genes (EGLN1, MIR210HG, NDRG1, ANKRD37, TCAF2, PFKFB3, BHLHE40, and MAFF). In spite of their simplicity, these classifiers achieve over 95% accuracy in cross-validation and over 80% accuracy when applied to additional challenging datasets. Our results indicate that the classifiers are able to identify hypoxic tumor samples from bulk RNA-seq and hypoxic regions within tumors from spatially resolved transcriptomics datasets. Moreover, application of the classifiers to histological sections from normal tissues suggests the presence of a hypoxic gene expression pattern in the kidney cortex not observed in other normoxic organs. Finally, the tree classifiers described herein outperform traditional hypoxic gene signatures when compared against a wide range of datasets. This work describes a set of hypoxic gene signatures, structured as simple decision trees, that identify hypoxic samples and regions with high accuracy and can be applied to a broad variety of gene expression datasets and formats. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04741-8. BioMed Central 2022-05-31 /pmc/articles/PMC9153107/ /pubmed/35641902 http://dx.doi.org/10.1186/s12859-022-04741-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research; Puente-Santamaría, Laura; Sanchez-Gonzalez, Lucia; Ramos-Ruiz, Ricardo; del Peso, Luis; Hypoxia classifier for transcriptome datasets |
title | Hypoxia classifier for transcriptome datasets |
title_sort | hypoxia classifier for transcriptome datasets |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9153107/ https://www.ncbi.nlm.nih.gov/pubmed/35641902 http://dx.doi.org/10.1186/s12859-022-04741-8 |
work_keys_str_mv | AT puentesantamarialaura hypoxiaclassifierfortranscriptomedatasets AT sanchezgonzalezlucia hypoxiaclassifierfortranscriptomedatasets AT ramosruizricardo hypoxiaclassifierfortranscriptomedatasets AT delpesoluis hypoxiaclassifierfortranscriptomedatasets |
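
The description above says that each classifier is a small decision tree combining gene identities with quantitative boundaries, so a single sample can be classified without a normoxic reference. The following is a minimal Python sketch of that idea only. The tree layout, the thresholds, the expression scale, and the names `Leaf`, `Split`, `classify`, and `EXAMPLE_TREE` are all hypothetical placeholders, not values or code from the article; only the gene symbols are taken from the record.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass(frozen=True)
class Leaf:
    """Terminal node holding the predicted class label."""
    label: str


@dataclass(frozen=True)
class Split:
    """Internal node: compare one gene's expression with a fixed boundary."""
    gene: str
    threshold: float
    low: "Node"   # branch taken when expression <= threshold
    high: "Node"  # branch taken when expression > threshold


Node = Union[Leaf, Split]


def classify(sample: Mapping[str, float], node: Node) -> str:
    """Walk the tree for one sample; no control sample is needed."""
    while isinstance(node, Split):
        if node.gene not in sample:
            raise KeyError(f"sample has no value for gene {node.gene!r}")
        node = node.high if sample[node.gene] > node.threshold else node.low
    return node.label


# HYPOTHETICAL tree: the structure and thresholds are placeholders chosen
# for illustration and are NOT the boundaries reported in the article.
EXAMPLE_TREE = Split(
    gene="EGLN1",
    threshold=0.5,
    low=Split("NDRG1", 0.5, low=Leaf("normoxic"), high=Leaf("hypoxic")),
    high=Split("ANKRD37", 0.3, low=Leaf("normoxic"), high=Leaf("hypoxic")),
)

if __name__ == "__main__":
    # Expression values are assumed to be on the same scale as the thresholds.
    print(classify({"EGLN1": 0.9, "ANKRD37": 0.8}, EXAMPLE_TREE))
    print(classify({"EGLN1": 0.1, "NDRG1": 0.2}, EXAMPLE_TREE))
```

Because the boundaries are stored in the tree itself, applying such a classifier to a new dataset only requires that each sample's expression values be on the scale the boundaries were defined on; the published trees and their actual boundaries are given in the article linked above.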