Multiple‐cumulative probabilities used to cluster and visualize transcriptomes

Bibliographic Details
Main Authors: Jia, Xingang; Liu, Yisu; Han, Qiuhong; Lu, Zuhong
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2017
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5715267/
https://www.ncbi.nlm.nih.gov/pubmed/29226087
http://dx.doi.org/10.1002/2211-5463.12327
Description
Summary: Analysis of gene expression data by clustering and visualization has played a central role in obtaining biological knowledge. Here, we used Pearson's correlation coefficient of multiple‐cumulative probabilities (PCC‐MCP) of genes to define the similarity of gene expression behaviors. To handle the high dimensionality of MCPs, we used icc‐cluster, a clustering algorithm that obtains solutions by iterating on cluster centers, with PCC‐MCP to group genes. We then used t‐distributed stochastic neighbor embedding (t‐SNE) of KC‐data to generate optimal maps for clusters of MCP (t‐SNE‐MCP‐O maps). From the analysis of several transcriptome data sets, we demonstrated clear advantages of icc‐cluster with PCC‐MCP over commonly used clustering methods. t‐SNE‐MCP‐O was also shown to give clear boundaries between clusters of PCC‐MCP, which made the relationships between clusters easy to visualize and understand.
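
The general idea of grouping genes by Pearson correlation while iterating on cluster centers can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' icc‐cluster. The summary does not define how MCPs are computed, so the input here is simply any per‐gene feature matrix. The function names (`pearson_to_centers`, `correlation_center_clustering`), the farthest‐first initialization, and the mean‐based center update are all hypothetical choices made for this example.

```python
import numpy as np

def _standardize(A):
    """Center each row and scale it to unit length (constant rows stay zero)."""
    A = A - A.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    return A / np.where(norms == 0, 1.0, norms)

def pearson_to_centers(X, C):
    """Pearson correlation between every row of X and every row of C."""
    return _standardize(np.asarray(X, dtype=float)) @ _standardize(np.asarray(C, dtype=float)).T

def correlation_center_clustering(X, k, n_iter=100):
    """Assign rows to the most correlated center, then recompute centers, until stable."""
    X = np.asarray(X, dtype=float)
    # Assumed initialization: start from row 0, then repeatedly add the row
    # that is least correlated with the centers chosen so far.
    chosen = [0]
    while len(chosen) < k:
        best = pearson_to_centers(X, X[chosen]).max(axis=1)
        best[chosen] = np.inf  # never pick the same row twice
        chosen.append(int(best.argmin()))
    centers = X[chosen].copy()
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        new_labels = pearson_to_centers(X, centers).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments no longer change
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers

# Toy input: three rising and three falling profiles (illustrative values only).
profiles = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [0.0, 1.0, 2.0, 3.2],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.0, 1.5],
    [3.0, 2.0, 1.0, 0.1],
])
labels, centers = correlation_center_clustering(profiles, k=2)
```

Because the similarity is a correlation, profiles with the same shape fall into the same group regardless of their absolute scale, which is the property the summary attributes to a correlation‐based similarity measure.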