
GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data

BACKGROUND: Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close pro...


Bibliographic Details
Main Authors:	Schulz, Tizian; Stoye, Jens; Doerr, Daniel
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998887/ https://www.ncbi.nlm.nih.gov/pubmed/29745835 http://dx.doi.org/10.1186/s12864-018-4622-0

collection	PubMed
description	BACKGROUND: Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. RESULTS: We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow several vertices to be associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. CONCLUSIONS: By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
id	pubmed-5998887
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	BMC Genomics
published	2018-05-08
rights	© The Author(s) 2018. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
