Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods

Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. Therefore, it is crucial to identify the novel subgroups of patients with similar molecular characteristics. It is possible to propose a better treatment strategy w...

Bibliographic Details
Main authors: Khadirnaikar, Seema; Shukla, Sudhanshu; Prasanna, S. R. M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10586677/
https://www.ncbi.nlm.nih.gov/pubmed/37856446
http://dx.doi.org/10.1371/journal.pone.0287176
author Khadirnaikar, Seema
Shukla, Sudhanshu
Prasanna, S. R. M.
collection PubMed
description Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. It is therefore crucial to identify novel subgroups of patients with similar molecular characteristics. A better treatment strategy can be proposed when patient heterogeneity is accounted for during subgroup identification, irrespective of the tissue of origin. This work proposes a machine learning (ML) based pipeline for subgroup identification in pan-cancer. Here, mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. The projected data were then clustered to identify novel multi-omics-based subgroups. Clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value < 0.0001). Subgroups formed by patients with different tumors shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to identify the novel subgroups for unseen samples. Additionally, the classification models were used to obtain class labels for the validation samples, and the molecular characteristics were verified. To summarize, this work identified novel ML subgroups using multi-omics data and showed that patients with different tumor types can be molecularly similar. We also proposed and validated the classification models for subgroup identification. The proposed classification models can be used to identify the novel multi-omics subgroups, and the molecular characteristics of each subgroup can be used to design appropriate treatment regimens.
format Online
Article
Text
id pubmed-10586677
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10586677 2023-10-20 Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods. Khadirnaikar, Seema; Shukla, Sudhanshu; Prasanna, S. R. M. PLoS One. Research Article. Public Library of Science 2023-10-19 /pmc/articles/PMC10586677/ /pubmed/37856446 http://dx.doi.org/10.1371/journal.pone.0287176 Text en © 2023 Khadirnaikar et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods
topic Research Article
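
The description in this record outlines a three-step pipeline: concatenate the omics feature blocks, project them non-linearly to a lower dimension, and cluster the result. The following is a minimal sketch of that flow only, not the authors' implementation. The record says just "an ML algorithm", so the RBF kernel PCA projection and the k-means clustering used here are stand-ins chosen for illustration. The data are synthetic, and every function name, block size, and parameter value is hypothetical.

```python
import numpy as np


def zscore(block):
    """Standardize each feature (column) so no omics layer dominates by scale."""
    std = block.std(axis=0)
    std[std == 0] = 1.0  # leave constant features at zero instead of dividing by 0
    return (block - block.mean(axis=0)) / std


def rbf_kernel_pca(X, n_components, gamma):
    """Non-linear projection via RBF kernel PCA (assumed stand-in technique)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2 * X @ X.T, 0)
    K = np.exp(-gamma * sq_dists)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    K_centered = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(K_centered)  # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))


def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (assumed stand-in clustering step)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-ins for the four omics layers (90 samples, 3 hidden groups).
rng = np.random.default_rng(42)
n_groups, n_per_group = 3, 30
true_groups = np.repeat(np.arange(n_groups), n_per_group)


def fake_omics(n_features):
    """Hypothetical generator: group-specific means plus unit Gaussian noise."""
    group_means = rng.normal(0, 3, size=(n_groups, n_features))
    noise = rng.normal(0, 1, size=(len(true_groups), n_features))
    return group_means[true_groups] + noise


mrna, mirna, methylation, protein = (fake_omics(p) for p in (50, 20, 40, 10))

# Step 1: feature-level concatenation of the standardized blocks.
X = np.hstack([zscore(b) for b in (mrna, mirna, methylation, protein)])
# Step 2: non-linear projection to a lower dimension.
Z = rbf_kernel_pca(X, n_components=5, gamma=1.0 / X.shape[1])
# Step 3: cluster the projected samples into candidate subgroups.
labels = kmeans(Z, k=3)
print(X.shape, Z.shape, np.bincount(labels, minlength=3))
```

Standardizing each block before concatenation is a design choice made for this sketch so that the layer with the most features or the largest numeric range does not dominate the distance computation; the record does not state how the authors scaled their data.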