
cola: an R/Bioconductor package for consensus partitioning through a general framework

Classification of high-throughput genomic data is a powerful method to assign samples to subgroups with specific molecular profiles. Consensus partitioning is the most widely applied approach to reveal subgroups by summarizing a consensus classification from a list of individual classifications gene...


Bibliographic Details
Main authors:	Gu, Zuguang; Schlesner, Matthias; Hübschmann, Daniel
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2020
Subjects:	Methods Online
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7897501/ https://www.ncbi.nlm.nih.gov/pubmed/33275159 http://dx.doi.org/10.1093/nar/gkaa1146

_version_	1783653682142773248
author	Gu, Zuguang; Schlesner, Matthias; Hübschmann, Daniel
author_facet	Gu, Zuguang; Schlesner, Matthias; Hübschmann, Daniel
author_sort	Gu, Zuguang
collection	PubMed
description	Classification of high-throughput genomic data is a powerful method to assign samples to subgroups with specific molecular profiles. Consensus partitioning is the most widely applied approach to reveal subgroups by summarizing a consensus classification from a list of individual classifications generated by repeatedly executing clustering on random subsets of the data. It is able to evaluate the stability of the classification. We implemented a new R/Bioconductor package, cola, that provides a general framework for consensus partitioning. With cola, various parameters and methods can be user-defined and easily integrated into different steps of an analysis, e.g., feature selection, sample classification or defining signatures. cola provides a new method named ATC (ability to correlate to other rows) to extract features and recommends spherical k-means clustering (skmeans) for subgroup classification. We show that ATC and skmeans have better performance than other commonly used methods by a comprehensive benchmark on public datasets. We also benchmark key parameters in the consensus partitioning procedure, which helps users to select optimal parameter values. Moreover, cola provides rich functionalities to apply multiple partitioning methods in parallel and directly compare their results, as well as rich visualizations. cola can automate the complete analysis and generates a comprehensive HTML report.
format	Online Article Text
id	pubmed-7897501
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-7897501 2021-02-25 cola: an R/Bioconductor package for consensus partitioning through a general framework. Gu, Zuguang; Schlesner, Matthias; Hübschmann, Daniel. Nucleic Acids Res. Methods Online. Oxford University Press 2020-12-04 /pmc/articles/PMC7897501/ /pubmed/33275159 http://dx.doi.org/10.1093/nar/gkaa1146 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	cola: an R/Bioconductor package for consensus partitioning through a general framework
topic	Methods Online
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7897501/ https://www.ncbi.nlm.nih.gov/pubmed/33275159 http://dx.doi.org/10.1093/nar/gkaa1146
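
The description in this record summarizes consensus partitioning as repeatedly clustering random subsets of the data and then summarizing how consistently samples are grouped together. The sketch below illustrates that general idea in Python with NumPy. It is not the cola package's API or the authors' implementation: the function names, the `subset_fraction` parameter, the choice to subsample features (rows), and the use of plain k-means rather than skmeans are all assumptions made for illustration.

```python
import numpy as np


def kmeans_labels(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation.

    `points` has shape (n_points, n_dims); returns one integer label per point.
    This is a stand-in clustering method, not the skmeans method named above.
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from every centre chosen so far.
    centres = [points[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centres], axis=0
        )
        centres.append(points[np.argmax(dists)])
    centres = np.array(centres, dtype=float)

    labels = None
    for _ in range(n_iter):
        # Assign every point to its nearest centre.
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        new_labels = np.argmin(dists, axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return labels


def consensus_matrix(data, k, n_runs=50, subset_fraction=0.8, seed=0):
    """Fraction of runs in which each pair of samples lands in the same cluster.

    `data` is a features x samples matrix. Each run clusters the samples using
    a random subset of the features (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = data.shape
    n_keep = max(1, int(round(subset_fraction * n_features)))
    together = np.zeros((n_samples, n_samples))
    for _ in range(n_runs):
        rows = rng.choice(n_features, size=n_keep, replace=False)
        labels = kmeans_labels(data[rows].T, k)
        # Pairwise co-membership for this run, accumulated over runs.
        together += labels[:, None] == labels[None, :]
    return together / n_runs


def demo():
    """Build a toy matrix with two clearly separated sample groups."""
    rng = np.random.default_rng(1)
    group_a = rng.normal(0.0, 0.5, size=(20, 5))
    group_b = rng.normal(5.0, 0.5, size=(20, 5))
    data = np.hstack([group_a, group_b])  # 20 features x 10 samples
    return consensus_matrix(data, k=2)
```

Entries of the returned matrix close to 1 or 0 indicate pairs of samples that are grouped together, or kept apart, in nearly every run; intermediate values point to an unstable classification, which is the stability assessment the description refers to.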

