Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs)
Episignatures are popular tools for the diagnosis of rare neurodevelopmental disorders. They are commonly based on a set of differentially methylated CpGs used in combination with a support vector machine model. DNA methylation (DNAm) data often include missing values due to changes in data generati...
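Main authors: | Giuili, Edoardo; Grolaux, Robin; Macedo, Catarina Z. N. M.; Desmyter, Laurence; Pichon, Bruno; Neuens, Sebastian; Vilain, Catheline; Olsen, Catharina; Van Dooren, Sonia; Smits, Guillaume; Defrance, Matthieu |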
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Springer Berlin Heidelberg
2023
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10676303/ https://www.ncbi.nlm.nih.gov/pubmed/37889307 http://dx.doi.org/10.1007/s00439-023-02609-2 |
_version_ | 1785141251406823424 |
---|---|
author | Giuili, Edoardo Grolaux, Robin Macedo, Catarina Z. N. M. Desmyter, Laurence Pichon, Bruno Neuens, Sebastian Vilain, Catheline Olsen, Catharina Van Dooren, Sonia Smits, Guillaume Defrance, Matthieu |
author_facet | Giuili, Edoardo Grolaux, Robin Macedo, Catarina Z. N. M. Desmyter, Laurence Pichon, Bruno Neuens, Sebastian Vilain, Catheline Olsen, Catharina Van Dooren, Sonia Smits, Guillaume Defrance, Matthieu |
author_sort | Giuili, Edoardo |
collection | PubMed |
description | Episignatures are popular tools for the diagnosis of rare neurodevelopmental disorders. They are commonly based on a set of differentially methylated CpGs used in combination with a support vector machine model. DNA methylation (DNAm) data often include missing values due to changes in data generation technology and batch effects. While many normalization methods exist for DNAm data, their impact on episignature performance has never been assessed. In addition, technologies to quantify DNAm evolve quickly and this may lead to poor transposition of existing episignatures generated on deprecated array versions to new ones. Indeed, probe removal between array versions, technologies or during preprocessing leads to missing values. Thus, the effect of missing data on episignature performance must also be carefully evaluated and addressed through imputation or an innovative approach to episignature design. In this paper, we used data from patients with Kabuki and Sotos syndromes to evaluate the influence of normalization methods, classification models and missing data on the prediction performance of two existing episignatures. We compare how six popular normalization methods for methylarray data affect episignature classification performance in Kabuki and Sotos syndromes and provide best-practice suggestions for building new episignatures. In this setting, we show that Illumina, Noob or Funnorm normalization methods achieved higher classification performance on the testing sets than Quantile, Raw and Swan normalization methods. We further show that penalized logistic regression and support vector machines perform best in the classification of Kabuki and Sotos syndrome patients.
Then, we describe a new paradigm to build episignatures based on the detection of differentially methylated regions (DMRs) and evaluate their performance compared to classical episignatures based on differentially methylated cytosines (DMCs) in the presence of missing data. We show that the performance of classical DMC-based episignatures suffers from the presence of missing data more than the DMR-based approach. We present a comprehensive evaluation of how the normalization of DNA methylation data affects episignature performance, using three popular classification models. We further evaluate how missing data affect those models’ predictions. Finally, we propose a novel methodology to develop episignatures based on the identification of differentially methylated regions and show that this method slightly outperforms classical episignatures in the presence of missing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00439-023-02609-2. |
format | Online Article Text |
id | pubmed-10676303 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-10676303 2023-10-27 Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs) Giuili, Edoardo Grolaux, Robin Macedo, Catarina Z. N. M. Desmyter, Laurence Pichon, Bruno Neuens, Sebastian Vilain, Catheline Olsen, Catharina Van Dooren, Sonia Smits, Guillaume Defrance, Matthieu Hum Genet Original Investigation Episignatures are popular tools for the diagnosis of rare neurodevelopmental disorders. They are commonly based on a set of differentially methylated CpGs used in combination with a support vector machine model. DNA methylation (DNAm) data often include missing values due to changes in data generation technology and batch effects. While many normalization methods exist for DNAm data, their impact on episignature performance has never been assessed. In addition, technologies to quantify DNAm evolve quickly and this may lead to poor transposition of existing episignatures generated on deprecated array versions to new ones. Indeed, probe removal between array versions, technologies or during preprocessing leads to missing values. Thus, the effect of missing data on episignature performance must also be carefully evaluated and addressed through imputation or an innovative approach to episignature design. In this paper, we used data from patients with Kabuki and Sotos syndromes to evaluate the influence of normalization methods, classification models and missing data on the prediction performance of two existing episignatures. We compare how six popular normalization methods for methylarray data affect episignature classification performance in Kabuki and Sotos syndromes and provide best-practice suggestions for building new episignatures. In this setting, we show that Illumina, Noob or Funnorm normalization methods achieved higher classification performance on the testing sets than Quantile, Raw and Swan normalization methods.
We further show that penalized logistic regression and support vector machines perform best in the classification of Kabuki and Sotos syndrome patients. Then, we describe a new paradigm to build episignatures based on the detection of differentially methylated regions (DMRs) and evaluate their performance compared to classical episignatures based on differentially methylated cytosines (DMCs) in the presence of missing data. We show that the performance of classical DMC-based episignatures suffers from the presence of missing data more than the DMR-based approach. We present a comprehensive evaluation of how the normalization of DNA methylation data affects episignature performance, using three popular classification models. We further evaluate how missing data affect those models’ predictions. Finally, we propose a novel methodology to develop episignatures based on the identification of differentially methylated regions and show that this method slightly outperforms classical episignatures in the presence of missing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00439-023-02609-2. Springer Berlin Heidelberg 2023-10-27 2023 /pmc/articles/PMC10676303/ /pubmed/37889307 http://dx.doi.org/10.1007/s00439-023-02609-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Original Investigation Giuili, Edoardo Grolaux, Robin Macedo, Catarina Z. N. M. Desmyter, Laurence Pichon, Bruno Neuens, Sebastian Vilain, Catheline Olsen, Catharina Van Dooren, Sonia Smits, Guillaume Defrance, Matthieu Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs) |
title | Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs) |
title_full | Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs) |
title_fullStr | Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs) |
title_full_unstemmed | Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs) |
title_short | Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs) |
title_sort | comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (ndds) |
topic | Original Investigation |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10676303/ https://www.ncbi.nlm.nih.gov/pubmed/37889307 http://dx.doi.org/10.1007/s00439-023-02609-2 |
work_keys_str_mv | AT giuiliedoardo comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT grolauxrobin comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT macedocatarinaznm comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT desmyterlaurence comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT pichonbruno comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT neuenssebastian comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT vilaincatheline comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT olsencatharina comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT vandoorensonia comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT smitsguillaume comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds AT defrancematthieu comprehensiveevaluationoftheimplementationofepisignaturesfordiagnosisofneurodevelopmentaldisordersndds |