Development of a Machine Learning Classifier for Brain Tumors Diagnosis Based on DNA Methylation Profile

Background: More than 150 types of brain tumors have been documented. Accurate diagnosis is important for making appropriate therapeutic decisions in treating the diseases. The goal of this study is to develop a DNA methylation profile-based classifier to accurately identify various kinds of brain t...

Bibliographic Details
Main Authors: Chen, Yuxing, Yan, Yixin, Xu, Moping, Chen, Wen, Lin, Jinyu, Zhao, Yan, Wu, Junze, Wang, Xianlong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581020/
https://www.ncbi.nlm.nih.gov/pubmed/36303797
http://dx.doi.org/10.3389/fbinf.2021.744345
author Chen, Yuxing
Yan, Yixin
Xu, Moping
Chen, Wen
Lin, Jinyu
Zhao, Yan
Wu, Junze
Wang, Xianlong
author_sort Chen, Yuxing
collection PubMed
description Background: More than 150 types of brain tumors have been documented. Accurate diagnosis is essential for making appropriate therapeutic decisions. The goal of this study is to develop a DNA methylation profile-based classifier that accurately identifies various kinds of brain tumors. Methods: Thirteen datasets of DNA methylation profiles were downloaded from the Gene Expression Omnibus (GEO) database. GSE90496 and GSE109379 were used as the training set and the validation set, respectively, and the remaining 11 datasets were used as the independent test set. The random forest algorithm was used to select CpG sites based on feature importance, and a multilayer perceptron (MLP) model was trained to classify the samples. Deconvolution with the debCAM package was used to explore differences in cellular composition among tumors. Results: From training data comprising 2,801 samples, 396,568 CpG sites were retained after preprocessing, of which 767 were selected as the modeling features. A three-layer MLP model with 1,320 nodes in the hidden layer was developed to predict the histological types of brain tumors. The prediction accuracy was 99.2%, 87.0%, and 96.58% on the training, validation, and test sets, respectively. The deconvolution analysis showed that cell proportions differed among tumor subtypes, and that these differences were approximately sufficient to distinguish tumor entities. Conclusion: We developed a robust classifier for central nervous system tumors and explored the reasons for its classification performance.
format Online
Article
Text
id pubmed-9581020
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9581020 2022-10-26 Development of a Machine Learning Classifier for Brain Tumors Diagnosis Based on DNA Methylation Profile Chen, Yuxing Yan, Yixin Xu, Moping Chen, Wen Lin, Jinyu Zhao, Yan Wu, Junze Wang, Xianlong Front Bioinform Bioinformatics Frontiers Media S.A. 
2021-11-08 /pmc/articles/PMC9581020/ /pubmed/36303797 http://dx.doi.org/10.3389/fbinf.2021.744345 Text en Copyright © 2021 Chen, Yan, Xu, Chen, Lin, Zhao, Wu and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Development of a Machine Learning Classifier for Brain Tumors Diagnosis Based on DNA Methylation Profile
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581020/
https://www.ncbi.nlm.nih.gov/pubmed/36303797
http://dx.doi.org/10.3389/fbinf.2021.744345