Patient subgrouping with distinct survival rates via integration of multiomics data on a Grassmann manifold

Bibliographic Details
Main authors: Alfatemi, Ali; Peng, Hong; Rong, Wentao; Zhang, Bin; Cai, Hongmin
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9308936/
https://www.ncbi.nlm.nih.gov/pubmed/35870923
http://dx.doi.org/10.1186/s12911-022-01938-y
Description
Summary:
BACKGROUND: Patient subgroups are important for easily understanding a disease and for providing precise yet personalized treatment through multiple omics dataset integration. Multiomics datasets are produced daily. Thus, the fusion of heterogeneous big data into intrinsic structures is an urgent problem. Novel mathematical methods are needed to process these data in a straightforward way.
RESULTS: We developed a novel method for subgrouping patients with distinct survival rates via the integration of multiple omics datasets and by using principal component analysis to reduce the high data dimensionality. Then, we constructed similarity graphs for patients, merged the graphs in a subspace, and analyzed them on a Grassmann manifold. The proposed method could identify patient subgroups that had not been reported previously by selecting the most critical information during the merging at each level of the omics dataset. Our method was tested on empirical multiomics datasets from The Cancer Genome Atlas.
CONCLUSION: Through the integration of microRNA, gene expression, and DNA methylation data, our method accurately identified patient subgroups and achieved superior performance compared with popular methods.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-022-01938-y.
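
The pipeline named in the summary (PCA, per-omics similarity graphs, merging in a subspace on a Grassmann manifold) can be illustrated with a minimal sketch. This record does not give the article's exact objective, so everything below is an assumption: the Gaussian kernel with a median bandwidth, the normalized Laplacian, the merge rule (sum of Laplacians minus `alpha` times the sum of per-view subspace projectors, one generic way to penalize projection distance on a Grassmann manifold), the plain k-means step, and all function names and parameters. The data are synthetic stand-ins, not data from The Cancer Genome Atlas.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project patients (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def similarity_graph(Z):
    """Dense Gaussian-kernel similarity; median bandwidth is an assumption."""
    sq = np.sum(Z ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    sigma2 = np.median(D2[D2 > 0])
    W = np.exp(-D2 / (2.0 * sigma2))
    np.fill_diagonal(W, 0.0)
    return W

def normalized_laplacian(W):
    """Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def merge_on_grassmann(laplacians, k, alpha=0.5):
    """Shared k-dim subspace: bottom eigenvectors of
    sum_i L_i - alpha * sum_i U_i U_i^T (assumed merge rule)."""
    n = laplacians[0].shape[0]
    L_mod = np.zeros((n, n))
    for L in laplacians:
        _, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
        U = vecs[:, :k]               # per-view spectral embedding
        L_mod += L - alpha * (U @ U.T)
    _, vecs = np.linalg.eigh(L_mod)
    return vecs[:, :k]

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Synthetic stand-ins for three omics views of the same 40 patients;
# the second half of the patients is shifted to form a second group.
rng = np.random.default_rng(0)
views = []
for n_features in (50, 30, 20):
    X = rng.normal(size=(40, n_features))
    X[20:] += 3.0
    views.append(X)

laplacians = [normalized_laplacian(similarity_graph(pca_reduce(X, 5)))
              for X in views]
embedding = merge_on_grassmann(laplacians, k=2)
labels = kmeans(embedding, k=2)
```

In this sketch `alpha` trades off each graph's own connectivity against agreement between the per-view subspaces. Comparing survival between the resulting groups, which the summary reports, would be a separate step and is not shown here.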