Deciphering the Methylation Landscape in Breast Cancer: Diagnostic and Prognostic Biosignatures through Automated Machine Learning

SIMPLE SUMMARY: Breast cancer (BrCa) is characterized by aberrant DNA methylation. We leveraged high-throughput methylation data from BrCa and normal breast tissues and identified 11,176 to 27,786 differentially methylated genes (DMGs) against clinically relevant end-points. Innovative automated mac...

Bibliographic Details
Main authors: Panagopoulou, Maria; Karaglani, Makrina; Manolopoulos, Vangelis G.; Iliopoulos, Ioannis; Tsamardinos, Ioannis; Chatzaki, Ekaterini
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8037759/
https://www.ncbi.nlm.nih.gov/pubmed/33918195
http://dx.doi.org/10.3390/cancers13071677
author Panagopoulou, Maria
Karaglani, Makrina
Manolopoulos, Vangelis G.
Iliopoulos, Ioannis
Tsamardinos, Ioannis
Chatzaki, Ekaterini
collection PubMed
description SIMPLE SUMMARY: Breast cancer (BrCa) is characterized by aberrant DNA methylation. We leveraged high-throughput methylation data from BrCa and normal breast tissues and identified 11,176 to 27,786 differentially methylated genes (DMGs) against clinically relevant end-points. Innovative automated machine learning was employed to construct three high-performing signatures for (1) the discrimination of BrCa patients from healthy individuals, (2) the identification of BrCa metastatic disease and (3) the early diagnosis of BrCa. Furthermore, functional analysis revealed that most genes selected in the signatures showed associations with BrCa, with regulation of transcription being the main biological process, the nucleus being the main cellular component and transcription factor activity and sequence-specific DNA binding being the main molecular functions. Overall, revisiting methylome datasets led to three high-performance signatures that are readily available for improving BrCa precision management, and to significant knowledge mining related to disease pathophysiology.
ABSTRACT: DNA methylation plays an important role in breast cancer (BrCa) pathogenesis and could contribute to driving its personalized management. We performed a complete bioinformatic analysis of BrCa whole-methylome datasets generated using the Illumina methylation 450 bead-chip array. Differential methylation analysis vs. clinical end-points resulted in 11,176 to 27,786 differentially methylated genes (DMGs). Innovative automated machine learning (AutoML) was employed to construct signatures with translational value. Three high-performing, low-feature-number signatures were built: (1) a 5-gene signature discriminating BrCa patients from healthy individuals (area under the curve (AUC): 0.994 (0.982–1.000)); (2) a 3-gene signature identifying BrCa metastatic disease (AUC: 0.986 (0.921–1.000)); and (3) six equivalent 5-gene signatures diagnosing early disease (AUC: 0.973 (0.920–1.000)). Validation in independent patient groups verified performance. Bioinformatic tools for functional analysis and protein interaction prediction were also employed. All protein-encoding features included in the signatures were associated with BrCa-related pathways. Functional analysis of DMGs highlighted the regulation of transcription as the main biological process, the nucleus as the main cellular component and transcription factor activity and sequence-specific DNA binding as the main molecular functions. Overall, three high-performance diagnostic/prognostic signatures were built and are readily available for improving BrCa precision management upon prospective clinical validation. Revisiting archived methylomes through novel bioinformatic approaches revealed significant knowledge clarifying the contribution of gene methylation events to breast carcinogenesis.
format Online
Article
Text
id pubmed-8037759
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8037759 2021-04-12 Deciphering the Methylation Landscape in Breast Cancer: Diagnostic and Prognostic Biosignatures through Automated Machine Learning Panagopoulou, Maria Karaglani, Makrina Manolopoulos, Vangelis G. Iliopoulos, Ioannis Tsamardinos, Ioannis Chatzaki, Ekaterini Cancers (Basel) Article MDPI 2021-04-02 /pmc/articles/PMC8037759/ /pubmed/33918195 http://dx.doi.org/10.3390/cancers13071677 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Deciphering the Methylation Landscape in Breast Cancer: Diagnostic and Prognostic Biosignatures through Automated Machine Learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8037759/
https://www.ncbi.nlm.nih.gov/pubmed/33918195
http://dx.doi.org/10.3390/cancers13071677