Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma
Knowledge of 1p/19q-codeletion and IDH1/2 mutational status is necessary to interpret any investigational study of diffuse gliomas in the modern era. While DNA sequencing is the gold standard for determining IDH mutational status, genome-wide methylation arrays and gene expression profiling have bee...
Main authors: | Nuechterlein, Nicholas; Shapiro, Linda G.; Holland, Eric C.; Cimino, Patrick J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8645099/ https://www.ncbi.nlm.nih.gov/pubmed/34863298 http://dx.doi.org/10.1186/s40478-021-01295-3 |
_version_ | 1784610238507253760 |
---|---|
author | Nuechterlein, Nicholas; Shapiro, Linda G.; Holland, Eric C.; Cimino, Patrick J. |
author_sort | Nuechterlein, Nicholas |
collection | PubMed |
description | Knowledge of 1p/19q-codeletion and IDH1/2 mutational status is necessary to interpret any investigational study of diffuse gliomas in the modern era. While DNA sequencing is the gold standard for determining IDH mutational status, genome-wide methylation arrays and gene expression profiling have been used for surrogate mutational determination. Previous studies by our group suggest that 1p/19q-codeletion and IDH mutational status can be predicted from genome-wide somatic copy number alteration (SCNA) data alone; however, a rigorous model to accomplish this task has yet to be established. In this study, we used SCNA data from 786 adult diffuse gliomas in The Cancer Genome Atlas (TCGA) to develop a two-stage classification system that identifies 1p/19q-codeleted oligodendrogliomas and predicts the IDH mutational status of astrocytic tumors using a machine-learning model. Cross-validation on TCGA SCNA data showed near-perfect classification. Furthermore, our astrocytic IDH mutation model validated well on four additional datasets (AUC = 0.97, AUC = 0.99, AUC = 0.95, AUC = 0.96), as did our 1p/19q-codeleted oligodendroglioma screen on the two datasets that contained oligodendrogliomas (MCC = 0.97, MCC = 0.97). We then retrained our system using data from these validation sets and applied it to a cohort of REMBRANDT study subjects for whom SCNA data, but not IDH mutational status, are available. Overall, using genome-wide SCNAs, we successfully developed a system to robustly predict 1p/19q-codeletion and IDH mutational status in diffuse gliomas. This system can assign molecular subtype labels to tumor samples of retrospective diffuse glioma cohorts that lack 1p/19q-codeletion and IDH mutational status, such as the REMBRANDT study, recasting these datasets as validation cohorts for diffuse glioma research. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40478-021-01295-3. |
format | Online Article Text |
id | pubmed-8645099 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8645099 2021-12-06 Acta Neuropathol Commun Research BioMed Central 2021-12-04 /pmc/articles/PMC8645099/ /pubmed/34863298 http://dx.doi.org/10.1186/s40478-021-01295-3 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8645099/ https://www.ncbi.nlm.nih.gov/pubmed/34863298 http://dx.doi.org/10.1186/s40478-021-01295-3 |
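
The abstract in this record reports the 1p/19q-codeletion screen in terms of the Matthews correlation coefficient (MCC) but does not define it. As a reading aid, the following is a minimal sketch of the standard MCC formula computed from a binary confusion matrix. The function name `mcc` and the example counts are hypothetical and are not taken from the study; returning 0.0 when the denominator is zero is a common convention, assumed here.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. The result lies in [-1, 1]; 1 is perfect agreement.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an empty row or column gives MCC = 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the study).
    print(round(mcc(tp=48, tn=150, fp=1, fn=2), 3))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when one class (for example, codeleted tumors) is much rarer than the other.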