meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles
BACKGROUND: Identification of the cancer subtype plays a crucial role to provide an accurate diagnosis and proper treatment to improve the clinical outcomes of patients. Recent studies have shown that DNA methylation is one of the key factors for tumorigenesis and tumor growth, where the DNA methyla...
Main Authors: | Choi, Joung Min; Park, Chaelin; Chae, Heejoon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131478/ https://www.ncbi.nlm.nih.gov/pubmed/37101254 http://dx.doi.org/10.1186/s12859-023-05272-6 |
_version_ | 1785031186664390656 |
---|---|
author | Choi, Joung Min Park, Chaelin Chae, Heejoon |
author_facet | Choi, Joung Min Park, Chaelin Chae, Heejoon |
author_sort | Choi, Joung Min |
collection | PubMed |
description | BACKGROUND: Identification of the cancer subtype plays a crucial role to provide an accurate diagnosis and proper treatment to improve the clinical outcomes of patients. Recent studies have shown that DNA methylation is one of the key factors for tumorigenesis and tumor growth, where the DNA methylation signatures have the potential to be utilized as cancer subtype-specific markers. However, due to the high dimensionality and the low number of DNA methylome cancer samples with the subtype information, still, to date, a cancer subtype classification method utilizing DNA methylome datasets has not been proposed. RESULTS: In this paper, we present meth-SemiCancer, a semi-supervised cancer subtype classification framework based on DNA methylation profiles. The proposed model was first pre-trained based on the methylation datasets with the cancer subtype labels. After that, meth-SemiCancer generated the pseudo-subtypes for the cancer datasets without subtype information based on the model’s prediction. Finally, fine-tuning was performed utilizing both the labeled and unlabeled datasets. CONCLUSIONS: From the performance comparison with the standard machine learning-based classifiers, meth-SemiCancer achieved the highest average F1-score and Matthews correlation coefficient, outperforming other methods. Fine-tuning the model with the unlabeled patient samples by providing the proper pseudo-subtypes, encouraged meth-SemiCancer to generalize better than the supervised neural network-based subtype classification method. meth-SemiCancer is publicly available at https://github.com/cbi-bioinfo/meth-SemiCancer. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05272-6. |
format | Online Article Text |
id | pubmed-10131478 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-101314782023-04-27 meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles Choi, Joung Min Park, Chaelin Chae, Heejoon BMC Bioinformatics Research BACKGROUND: Identification of the cancer subtype plays a crucial role to provide an accurate diagnosis and proper treatment to improve the clinical outcomes of patients. Recent studies have shown that DNA methylation is one of the key factors for tumorigenesis and tumor growth, where the DNA methylation signatures have the potential to be utilized as cancer subtype-specific markers. However, due to the high dimensionality and the low number of DNA methylome cancer samples with the subtype information, still, to date, a cancer subtype classification method utilizing DNA methylome datasets has not been proposed. RESULTS: In this paper, we present meth-SemiCancer, a semi-supervised cancer subtype classification framework based on DNA methylation profiles. The proposed model was first pre-trained based on the methylation datasets with the cancer subtype labels. After that, meth-SemiCancer generated the pseudo-subtypes for the cancer datasets without subtype information based on the model’s prediction. Finally, fine-tuning was performed utilizing both the labeled and unlabeled datasets. CONCLUSIONS: From the performance comparison with the standard machine learning-based classifiers, meth-SemiCancer achieved the highest average F1-score and Matthews correlation coefficient, outperforming other methods. Fine-tuning the model with the unlabeled patient samples by providing the proper pseudo-subtypes, encouraged meth-SemiCancer to generalize better than the supervised neural network-based subtype classification method. meth-SemiCancer is publicly available at https://github.com/cbi-bioinfo/meth-SemiCancer. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05272-6. 
BioMed Central 2023-04-26 /pmc/articles/PMC10131478/ /pubmed/37101254 http://dx.doi.org/10.1186/s12859-023-05272-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Choi, Joung Min Park, Chaelin Chae, Heejoon meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles |
title | meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles |
title_full | meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles |
title_fullStr | meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles |
title_full_unstemmed | meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles |
title_short | meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles |
title_sort | meth-semicancer: a cancer subtype classification framework via semi-supervised learning utilizing dna methylation profiles |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131478/ https://www.ncbi.nlm.nih.gov/pubmed/37101254 http://dx.doi.org/10.1186/s12859-023-05272-6 |
work_keys_str_mv | AT choijoungmin methsemicanceracancersubtypeclassificationframeworkviasemisupervisedlearningutilizingdnamethylationprofiles AT parkchaelin methsemicanceracancersubtypeclassificationframeworkviasemisupervisedlearningutilizingdnamethylationprofiles AT chaeheejoon methsemicanceracancersubtypeclassificationframeworkviasemisupervisedlearningutilizingdnamethylationprofiles |
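
The description field above names three steps: pre-training on subtype-labeled methylation profiles, assigning pseudo-subtypes to unlabeled profiles from the model's predictions, and fine-tuning on both sets. The sketch below illustrates that general pseudo-labeling pattern only. It is not the authors' implementation (which is at the GitHub URL in the record): it swaps in plain softmax regression for their neural network, uses synthetic data, and the function names and the `threshold` confidence filter are hypothetical additions that the abstract does not mention.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, n_classes, W=None, epochs=200, lr=0.1):
    # Softmax regression by gradient descent.
    # Passing an existing W continues training from it (fine-tuning).
    if W is None:
        W = np.zeros((X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        P = softmax(X @ W)
        W = W - lr * X.T @ (P - Y) / len(X)
    return W

def pseudo_label_fit(X_lab, y_lab, X_unlab, n_classes, threshold=0.8):
    # Step 1: pre-train on the labeled samples only.
    W = train(X_lab, y_lab, n_classes)
    # Step 2: predict pseudo-subtypes for the unlabeled samples.
    proba = softmax(X_unlab @ W)
    pseudo = proba.argmax(axis=1)
    # Assumption: keep only confident pseudo-labels (not from the abstract).
    keep = proba.max(axis=1) >= threshold
    # Step 3: fine-tune on labeled plus pseudo-labeled samples.
    X_all = np.vstack([X_lab, X_unlab[keep]])
    y_all = np.concatenate([y_lab, pseudo[keep]])
    W = train(X_all, y_all, n_classes, W=W)
    return W, pseudo, keep

# Toy data: two well-separated synthetic "subtypes" in 5 features.
rng = np.random.default_rng(0)
means = np.array([[1.5] * 5, [-1.5] * 5])
y_lab = np.array([0] * 10 + [1] * 10)
X_lab = means[y_lab] + rng.normal(size=(20, 5))
y_unlab_true = rng.integers(0, 2, size=200)
X_unlab = means[y_unlab_true] + rng.normal(size=(200, 5))

W, pseudo, keep = pseudo_label_fit(X_lab, y_lab, X_unlab, n_classes=2)
accuracy = (softmax(X_unlab @ W).argmax(axis=1) == y_unlab_true).mean()
print(f"kept {keep.sum()} pseudo-labels, accuracy on unlabeled set: {accuracy:.3f}")
```

Real methylation data has far more features than samples, which is the difficulty the abstract points to, so a practical version would need feature selection or regularization that this toy example omits.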