
Evaluation and comparison of multi-omics data integration methods for cancer subtyping

Bibliographic Details
Main authors: Duan, Ran; Gao, Lin; Gao, Yong; Hu, Yuxuan; Xu, Han; Huang, Mingfeng; Song, Kuo; Wang, Hongda; Dong, Yongqiang; Jiang, Chaoqun; Zhang, Chenxing; Jia, Songwei
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8384175/
https://www.ncbi.nlm.nih.gov/pubmed/34383739
http://dx.doi.org/10.1371/journal.pcbi.1009224
author Duan, Ran
Gao, Lin
Gao, Yong
Hu, Yuxuan
Xu, Han
Huang, Mingfeng
Song, Kuo
Wang, Hongda
Dong, Yongqiang
Jiang, Chaoqun
Zhang, Chenxing
Jia, Songwei
author_sort Duan, Ran
collection PubMed
description Computational integrative analysis has become a significant approach in the data-driven exploration of biological problems. Many integration methods for cancer subtyping have been proposed, but evaluating these methods has become a complicated problem due to the lack of gold standards. Moreover, questions of practical importance remain to be addressed regarding the impact of selecting appropriate data types and combinations on the performance of integrative studies. Here, we constructed three classes of benchmarking datasets of nine cancers in TCGA by considering all the eleven combinations of four multi-omics data types. Using these datasets, we conducted a comprehensive evaluation of ten representative integration methods for cancer subtyping in terms of accuracy measured by combining both clustering accuracy and clinical significance, robustness, and computational efficiency. We subsequently investigated the influence of different omics data on cancer subtyping and the effectiveness of their combinations. Refuting the widely held intuition that incorporating more types of omics data always produces better results, our analyses showed that there are situations where integrating more omics data negatively impacts the performance of integration methods. Our analyses also suggested several effective combinations for most cancers under our studies, which may be of particular interest to researchers in omics data analysis.
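The count of eleven combinations in the description is consistent with taking every subset of two or more of the four data types (6 pairs + 4 triples + 1 set of all four). The minimal Python sketch below illustrates that arithmetic only. This record does not name the four data types, so the labels are placeholders, and the assumption that single-omics datasets are excluded from the count is ours rather than something stated here.

```python
from itertools import combinations

# Placeholder labels: this record does not name the four omics data types.
OMICS_TYPES = ["omics_A", "omics_B", "omics_C", "omics_D"]


def multi_omics_combinations(types, min_size=2):
    """Return every subset of `types` with at least `min_size` members."""
    return [
        combo
        for size in range(min_size, len(types) + 1)
        for combo in combinations(types, size)
    ]


combos = multi_omics_combinations(OMICS_TYPES)
for combo in combos:
    print(" + ".join(combo))

# 6 pairs + 4 triples + 1 quadruple
print(len(combos))
```

Each tuple in `combos` would correspond to one integrated dataset per cancer under this reading. Setting `min_size=1` would also include the four single-omics baselines.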
format Online
Article
Text
id pubmed-8384175
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8384175 2021-08-25 Evaluation and comparison of multi-omics data integration methods for cancer subtyping Duan, Ran; Gao, Lin; Gao, Yong; Hu, Yuxuan; Xu, Han; Huang, Mingfeng; Song, Kuo; Wang, Hongda; Dong, Yongqiang; Jiang, Chaoqun; Zhang, Chenxing; Jia, Songwei PLoS Comput Biol Research Article
Public Library of Science 2021-08-12 /pmc/articles/PMC8384175/ /pubmed/34383739 http://dx.doi.org/10.1371/journal.pcbi.1009224 Text en © 2021 Duan et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evaluation and comparison of multi-omics data integration methods for cancer subtyping
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8384175/
https://www.ncbi.nlm.nih.gov/pubmed/34383739
http://dx.doi.org/10.1371/journal.pcbi.1009224