Improve consensus partitioning via a hierarchical procedure
Consensus partitioning is an unsupervised method widely used in high-throughput data analysis for revealing subgroups and assigning stability for the classification. However, standard consensus partitioning procedures are weak for identifying large numbers of stable subgroups. There are two major is...
Main authors: | Gu, Zuguang; Hübschmann, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Problem Solving Protocol |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9116221/ https://www.ncbi.nlm.nih.gov/pubmed/35289356 http://dx.doi.org/10.1093/bib/bbac048 |
_version_ | 1784710073257295872 |
---|---|
author | Gu, Zuguang Hübschmann, Daniel |
author_facet | Gu, Zuguang Hübschmann, Daniel |
author_sort | Gu, Zuguang |
collection | PubMed |
description | Consensus partitioning is an unsupervised method widely used in high-throughput data analysis for revealing subgroups and assigning stability for the classification. However, standard consensus partitioning procedures are weak for identifying large numbers of stable subgroups. There are two major issues. First, subgroups with small differences are difficult to separate when they are detected simultaneously with subgroups with large differences. Second, the stability of the classification generally decreases as the number of subgroups increases. In this work, we propose a new strategy that addresses both issues by applying consensus partitioning in a hierarchical procedure. We demonstrate that hierarchical consensus partitioning can efficiently reveal more meaningful subgroups. We also test its performance in revealing a large number of subgroups in a large DNA methylation dataset. Hierarchical consensus partitioning is implemented in the R package cola, which provides comprehensive functionality for analysis and visualization. It can also automate the analysis with as few as two lines of code, generating a detailed HTML report that contains the complete analysis. The cola package is available at https://bioconductor.org/packages/cola/. |
format | Online Article Text |
id | pubmed-9116221 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9116221 2022-05-19 Improve consensus partitioning via a hierarchical procedure Gu, Zuguang Hübschmann, Daniel Brief Bioinform Problem Solving Protocol Oxford University Press 2022-03-14 /pmc/articles/PMC9116221/ /pubmed/35289356 http://dx.doi.org/10.1093/bib/bbac048 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Problem Solving Protocol Gu, Zuguang Hübschmann, Daniel Improve consensus partitioning via a hierarchical procedure |
title | Improve consensus partitioning via a hierarchical procedure |
title_full | Improve consensus partitioning via a hierarchical procedure |
title_fullStr | Improve consensus partitioning via a hierarchical procedure |
title_full_unstemmed | Improve consensus partitioning via a hierarchical procedure |
title_short | Improve consensus partitioning via a hierarchical procedure |
title_sort | improve consensus partitioning via a hierarchical procedure |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9116221/ https://www.ncbi.nlm.nih.gov/pubmed/35289356 http://dx.doi.org/10.1093/bib/bbac048 |
work_keys_str_mv | AT guzuguang improveconsensuspartitioningviaahierarchicalprocedure AT hubschmanndaniel improveconsensuspartitioningviaahierarchicalprocedure |