Cargando…

Efficient weighted univariate clustering maps outstanding dysregulated genomic zones in human cancers

MOTIVATION: Chromosomal patterning of gene expression in cancer can arise from aneuploidy, genome disorganization or abnormal DNA methylation. To map such patterns, we introduce a weighted univariate clustering algorithm to guarantee linear runtime, optimality and reproducibility. RESULTS: We presen...

Descripción completa

Detalles Bibliográficos
Autores principales: Song, Mingzhou, Zhong, Hua
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7755420/
https://www.ncbi.nlm.nih.gov/pubmed/32619008
http://dx.doi.org/10.1093/bioinformatics/btaa613
Descripción
Sumario:MOTIVATION: Chromosomal patterning of gene expression in cancer can arise from aneuploidy, genome disorganization or abnormal DNA methylation. To map such patterns, we introduce a weighted univariate clustering algorithm to guarantee linear runtime, optimality and reproducibility. RESULTS: We present the chromosome clustering method, establish its optimality and runtime and evaluate its performance. It uses dynamic programming enhanced with an algorithm to reduce search-space in-place to decrease runtime overhead. Using the method, we delineated outstanding genomic zones in 17 human cancer types. We identified strong continuity in dysregulation polarity—dominance by either up- or downregulated genes in a zone—along chromosomes in all cancer types. Significantly polarized dysregulation zones specific to cancer types are found, offering potential diagnostic biomarkers. Unreported previously, a total of 109 loci with conserved dysregulation polarity across cancer types give insights into pan-cancer mechanisms. Efficient chromosomal clustering opens a window to characterize molecular patterns in cancer genome and beyond. AVAILABILITY AND IMPLEMENTATION: Weighted univariate clustering algorithms are implemented within the R package ‘Ckmeans.1d.dp’ (4.0.0 or above), freely available at https://cran.r-project.org/package=Ckmeans.1d.dp. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.