
Large-scale clustering of CAGE tag expression data

BACKGROUND: Recent analyses have suggested that many genes possess multiple transcription start sites (TSSs) that are differentially utilized in different tissues and cell lines. We have identified a huge number of TSSs mapped onto the mouse genome using the cap analysis of gene expression (CAGE) me...

Bibliographic Details
Main authors:	Shimokawa, Kazuro, Okamura-Oho, Yuko, Kurita, Takio, Frith, Martin C, Kawai, Jun, Carninci, Piero, Hayashizaki, Yoshihide
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1890301/ https://www.ncbi.nlm.nih.gov/pubmed/17517134 http://dx.doi.org/10.1186/1471-2105-8-161

_version_	1782133716908244992
author	Shimokawa, Kazuro Okamura-Oho, Yuko Kurita, Takio Frith, Martin C Kawai, Jun Carninci, Piero Hayashizaki, Yoshihide
author_facet	Shimokawa, Kazuro Okamura-Oho, Yuko Kurita, Takio Frith, Martin C Kawai, Jun Carninci, Piero Hayashizaki, Yoshihide
author_sort	Shimokawa, Kazuro
collection	PubMed
description	BACKGROUND: Recent analyses have suggested that many genes possess multiple transcription start sites (TSSs) that are differentially utilized in different tissues and cell lines. We have identified a huge number of TSSs mapped onto the mouse genome using the cap analysis of gene expression (CAGE) method. The standard hierarchical clustering algorithm, which gives us easily understandable graphical tree images, has difficulties in processing such huge amounts of TSS data and a better method to calculate and display the results is needed. RESULTS: We use a combination of hierarchical and non-hierarchical clustering to cluster expression profiles of TSSs based on a large amount of CAGE data to profit from the best of both methods. We processed the genome-wide expression data, including 159,075 TSSs derived from 127 RNA samples of various organs of mouse, and succeeded in categorizing them into 70–100 clusters. The clusters exhibited intriguing biological features: a cluster supergroup with a ubiquitous expression profile, tissue-specific patterns, a distinct distribution of non-coding RNA and functional TSS groups. CONCLUSION: Our approach succeeded in greatly reducing the calculation cost, and is an appropriate solution for analyzing large-scale TSS usage data.
format	Text
id	pubmed-1890301
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-18903012007-06-08 Large-scale clustering of CAGE tag expression data Shimokawa, Kazuro Okamura-Oho, Yuko Kurita, Takio Frith, Martin C Kawai, Jun Carninci, Piero Hayashizaki, Yoshihide BMC Bioinformatics Methodology Article BACKGROUND: Recent analyses have suggested that many genes possess multiple transcription start sites (TSSs) that are differentially utilized in different tissues and cell lines. We have identified a huge number of TSSs mapped onto the mouse genome using the cap analysis of gene expression (CAGE) method. The standard hierarchical clustering algorithm, which gives us easily understandable graphical tree images, has difficulties in processing such huge amounts of TSS data and a better method to calculate and display the results is needed. RESULTS: We use a combination of hierarchical and non-hierarchical clustering to cluster expression profiles of TSSs based on a large amount of CAGE data to profit from the best of both methods. We processed the genome-wide expression data, including 159,075 TSSs derived from 127 RNA samples of various organs of mouse, and succeeded in categorizing them into 70–100 clusters. The clusters exhibited intriguing biological features: a cluster supergroup with a ubiquitous expression profile, tissue-specific patterns, a distinct distribution of non-coding RNA and functional TSS groups. CONCLUSION: Our approach succeeded in greatly reducing the calculation cost, and is an appropriate solution for analyzing large-scale TSS usage data. BioMed Central 2007-05-21 /pmc/articles/PMC1890301/ /pubmed/17517134 http://dx.doi.org/10.1186/1471-2105-8-161 Text en Copyright © 2007 Shimokawa et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Methodology Article Shimokawa, Kazuro Okamura-Oho, Yuko Kurita, Takio Frith, Martin C Kawai, Jun Carninci, Piero Hayashizaki, Yoshihide Large-scale clustering of CAGE tag expression data
title	Large-scale clustering of CAGE tag expression data
title_sort	large-scale clustering of cage tag expression data
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1890301/ https://www.ncbi.nlm.nih.gov/pubmed/17517134 http://dx.doi.org/10.1186/1471-2105-8-161
work_keys_str_mv	AT shimokawakazuro largescaleclusteringofcagetagexpressiondata AT okamuraohoyuko largescaleclusteringofcagetagexpressiondata AT kuritatakio largescaleclusteringofcagetagexpressiondata AT frithmartinc largescaleclusteringofcagetagexpressiondata AT kawaijun largescaleclusteringofcagetagexpressiondata AT carnincipiero largescaleclusteringofcagetagexpressiondata AT hayashizakiyoshihide largescaleclusteringofcagetagexpressiondata
