
Improve consensus partitioning via a hierarchical procedure

Consensus partitioning is an unsupervised method widely used in high-throughput data analysis for revealing subgroups and assigning stability for the classification. However, standard consensus partitioning procedures are weak for identifying large numbers of stable subgroups. There are two major is...


Bibliographic Details
Main authors: Gu, Zuguang; Hübschmann, Daniel
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9116221/
https://www.ncbi.nlm.nih.gov/pubmed/35289356
http://dx.doi.org/10.1093/bib/bbac048
_version_ 1784710073257295872
author Gu, Zuguang
Hübschmann, Daniel
collection PubMed
description Consensus partitioning is an unsupervised method widely used in high-throughput data analysis for revealing subgroups and assigning stability for the classification. However, standard consensus partitioning procedures are weak for identifying large numbers of stable subgroups. There are two major issues. First, subgroups with small differences are difficult to separate when they are detected simultaneously with subgroups with large differences. Second, the stability of classification generally decreases as the number of subgroups increases. In this work, we proposed a new strategy to solve these two issues by applying consensus partitioning in a hierarchical procedure. We demonstrated that hierarchical consensus partitioning can efficiently reveal more meaningful subgroups. We also tested the performance of hierarchical consensus partitioning in revealing a large number of subgroups on a large deoxyribonucleic acid methylation dataset. Hierarchical consensus partitioning is implemented in the R package cola, which provides comprehensive functionality for analysis and visualization. It can also automate the analysis with as few as two lines of code, generating a detailed HTML report containing the complete analysis. The cola package is available at https://bioconductor.org/packages/cola/.
format Online
Article
Text
id pubmed-9116221
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9116221 2022-05-19 Improve consensus partitioning via a hierarchical procedure Gu, Zuguang Hübschmann, Daniel Brief Bioinform Problem Solving Protocol Oxford University Press 2022-03-14 /pmc/articles/PMC9116221/ /pubmed/35289356 http://dx.doi.org/10.1093/bib/bbac048 Text en © The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Improve consensus partitioning via a hierarchical procedure
topic Problem Solving Protocol