Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods
Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. Therefore, it is crucial to identify the novel subgroups of patients with similar molecular characteristics. It is possible to propose a better treatment strategy w...
Main authors: | Khadirnaikar, Seema; Shukla, Sudhanshu; Prasanna, S. R. M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10586677/ https://www.ncbi.nlm.nih.gov/pubmed/37856446 http://dx.doi.org/10.1371/journal.pone.0287176 |
_version_ | 1785123196880551936 |
---|---|
author | Khadirnaikar, Seema Shukla, Sudhanshu Prasanna, S. R. M. |
author_facet | Khadirnaikar, Seema Shukla, Sudhanshu Prasanna, S. R. M. |
author_sort | Khadirnaikar, Seema |
collection | PubMed |
description | Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. Therefore, it is crucial to identify the novel subgroups of patients with similar molecular characteristics. It is possible to propose a better treatment strategy when the heterogeneity of the patient is accounted for during subgroup identification, irrespective of the tissue of origin. This work proposes a machine learning (ML) based pipeline for subgroup identification in pan-cancer. Here, mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. This data was then clustered to identify multi-omics-based novel subgroups. The clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value < 0.0001). The subgroups formed by the patients from different tumors shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to identify the novel subgroups for unseen samples. Additionally, the classification models were used to obtain the class labels for the validation samples, and the molecular characteristics were verified. To summarize, this work identified novel ML subgroups using multi-omics data and showed that the patients with different tumor types could be similar molecularly. We also proposed and validated the classification models for subgroup identification. The proposed classification models can be used to identify the novel multi-omics subgroups, and the molecular characteristics of each subgroup can be used to design an appropriate treatment regimen. |
format | Online Article Text |
id | pubmed-10586677 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10586677 2023-10-20 Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods Khadirnaikar, Seema Shukla, Sudhanshu Prasanna, S. R. M. PLoS One Research Article Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. Therefore, it is crucial to identify the novel subgroups of patients with similar molecular characteristics. It is possible to propose a better treatment strategy when the heterogeneity of the patient is accounted for during subgroup identification, irrespective of the tissue of origin. This work proposes a machine learning (ML) based pipeline for subgroup identification in pan-cancer. Here, mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. This data was then clustered to identify multi-omics-based novel subgroups. The clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value < 0.0001). The subgroups formed by the patients from different tumors shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to identify the novel subgroups for unseen samples. Additionally, the classification models were used to obtain the class labels for the validation samples, and the molecular characteristics were verified. To summarize, this work identified novel ML subgroups using multi-omics data and showed that the patients with different tumor types could be similar molecularly. We also proposed and validated the classification models for subgroup identification. The proposed classification models can be used to identify the novel multi-omics subgroups, and the molecular characteristics of each subgroup can be used to design an appropriate treatment regimen. Public Library of Science 2023-10-19 /pmc/articles/PMC10586677/ /pubmed/37856446 http://dx.doi.org/10.1371/journal.pone.0287176 Text en © 2023 Khadirnaikar et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Khadirnaikar, Seema Shukla, Sudhanshu Prasanna, S. R. M. Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods |
title | Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods |
title_full | Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods |
title_fullStr | Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods |
title_full_unstemmed | Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods |
title_short | Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods |
title_sort | integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10586677/ https://www.ncbi.nlm.nih.gov/pubmed/37856446 http://dx.doi.org/10.1371/journal.pone.0287176 |
work_keys_str_mv | AT khadirnaikarseema integrationofpancancermultiomicsdatafornovelmixedsubgroupidentificationusingmachinelearningmethods AT shuklasudhanshu integrationofpancancermultiomicsdatafornovelmixedsubgroupidentificationusingmachinelearningmethods AT prasannasrm integrationofpancancermultiomicsdatafornovelmixedsubgroupidentificationusingmachinelearningmethods |
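
The description field above outlines a three-step pipeline: concatenate the mRNA, miRNA, DNA methylation, and protein expression features, project them non-linearly to a lower dimension, and cluster the result. The sketch below illustrates that flow only. The record does not name the projection or clustering algorithm, so RBF kernel PCA and k-means are swapped in here as stand-ins; the synthetic data, feature counts, `gamma`, and the choice of three clusters are all assumptions made for illustration, and the helper names (`zscore`, `rbf_kernel_pca`, `kmeans`) are hypothetical.

```python
import numpy as np


def zscore(block):
    """Standardize each feature (column) to zero mean and unit variance."""
    std = block.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    return (block - block.mean(axis=0)) / std


def rbf_kernel_pca(X, n_components, gamma):
    """Non-linear projection via RBF kernel PCA (stand-in, not the paper's method)."""
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(K_centered)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    # Training-sample projections are eigenvectors scaled by sqrt(eigenvalue).
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (stand-in clustering step)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in data: 60 samples, four omics blocks (sizes are arbitrary).
rng = np.random.default_rng(42)
n_samples = 60
group = np.repeat([0, 1, 2], n_samples // 3)  # hidden structure, used only to simulate data
omics = {
    "mrna": rng.normal(size=(n_samples, 50)) + group[:, None],
    "mirna": rng.normal(size=(n_samples, 20)) + group[:, None],
    "methylation": rng.normal(size=(n_samples, 30)) - group[:, None],
    "protein": rng.normal(size=(n_samples, 10)) + group[:, None],
}

# Step 1: scale each block, then concatenate features sample-wise.
X = np.hstack([zscore(block) for block in omics.values()])
# Step 2: non-linear projection to a lower dimension.
Z = rbf_kernel_pca(X, n_components=5, gamma=1.0 / X.shape[1])
# Step 3: cluster the low-dimensional representation into candidate subgroups.
subgroups = kmeans(Z, k=3)
```

Scaling each block before concatenation keeps an omics layer with many features or a large numeric range from dominating the projection; whether the authors scaled their inputs this way is not stated in this record.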