Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma

Bibliographic Details
Main authors: Nuechterlein, Nicholas, Shapiro, Linda G., Holland, Eric C., Cimino, Patrick J.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8645099/
https://www.ncbi.nlm.nih.gov/pubmed/34863298
http://dx.doi.org/10.1186/s40478-021-01295-3
author Nuechterlein, Nicholas
Shapiro, Linda G.
Holland, Eric C.
Cimino, Patrick J.
collection PubMed
description Knowledge of 1p/19q-codeletion and IDH1/2 mutational status is necessary to interpret any investigational study of diffuse gliomas in the modern era. While DNA sequencing is the gold standard for determining IDH mutational status, genome-wide methylation arrays and gene expression profiling have been used for surrogate mutational determination. Previous studies by our group suggest that 1p/19q-codeletion and IDH mutational status can be predicted by genome-wide somatic copy number alteration (SCNA) data alone; however, a rigorous model to accomplish this task has yet to be established. In this study, we used SCNA data from 786 adult diffuse gliomas in The Cancer Genome Atlas (TCGA) to develop a two-stage classification system that identifies 1p/19q-codeleted oligodendrogliomas and predicts the IDH mutational status of astrocytic tumors using a machine-learning model. Cross-validation on TCGA SCNA data showed near-perfect classification. Furthermore, our astrocytic IDH mutation model validated well on four additional datasets (AUC = 0.97, AUC = 0.99, AUC = 0.95, AUC = 0.96), as did our 1p/19q-codeleted oligodendroglioma screen on the two datasets that contained oligodendrogliomas (MCC = 0.97, MCC = 0.97). We then retrained our system using data from these validation sets and applied it to a cohort of REMBRANDT study subjects for whom SCNA data, but not IDH mutational status, are available. Overall, using genome-wide SCNAs, we successfully developed a system to robustly predict 1p/19q-codeletion and IDH mutational status in diffuse gliomas. This system can assign molecular subtype labels to tumor samples of retrospective diffuse glioma cohorts that lack 1p/19q-codeletion and IDH mutational status, such as the REMBRANDT study, recasting these datasets as validation cohorts for diffuse glioma research. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40478-021-01295-3.
format Online
Article
Text
id pubmed-8645099
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8645099 2021-12-06 Acta Neuropathol Commun Research BioMed Central 2021-12-04 /pmc/articles/PMC8645099/ /pubmed/34863298 http://dx.doi.org/10.1186/s40478-021-01295-3 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma
topic Research
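
The description field above reports validation performance as AUC (for the astrocytic IDH mutation model) and MCC (for the 1p/19q-codeleted oligodendroglioma screen). As a minimal sketch of how these two metrics are defined, and not the authors' code, the following pure-Python functions compute them on small made-up inputs. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions chosen for illustration; they are not data or settings from the study.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which
    the positive sample gets the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = IDH-mutant, 0 = IDH-wildtype.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
# Assumed threshold of 0.5 to turn scores into hard calls.
toy_calls = [1 if s >= 0.5 else 0 for s in toy_scores]

print("AUC:", auc(toy_labels, toy_scores))
print("MCC:", mcc(toy_labels, toy_calls))
```

AUC is computed from the continuous scores and needs no threshold, while MCC is computed from hard class calls, which is consistent with the description reporting AUC for the probabilistic IDH model and MCC for the codeletion screen.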