Identifying gene clusters by discovering common intervals in indeterminate strings
BACKGROUND: Comparative analyses of chromosomal gene orders are successfully used to predict gene clusters in bacterial and fungal genomes. Present models for detecting sets of co-localized genes in chromosomal sequences require prior knowledge of gene family assignments of genes in the dataset of i...
Main authors: | Doerr, Daniel; Stoye, Jens; Böcker, Sebastian; Jahn, Katharina |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4274641/ https://www.ncbi.nlm.nih.gov/pubmed/25571793 http://dx.doi.org/10.1186/1471-2164-15-S6-S2 |
_version_ | 1782350008472829952 |
---|---|
author | Doerr, Daniel Stoye, Jens Böcker, Sebastian Jahn, Katharina |
author_facet | Doerr, Daniel Stoye, Jens Böcker, Sebastian Jahn, Katharina |
author_sort | Doerr, Daniel |
collection | PubMed |
description | BACKGROUND: Comparative analyses of chromosomal gene orders are successfully used to predict gene clusters in bacterial and fungal genomes. Present models for detecting sets of co-localized genes in chromosomal sequences require prior knowledge of gene family assignments of genes in the dataset of interest. These families are often computationally predicted on the basis of sequence similarity or higher order features of gene products. Errors introduced in this process amplify in subsequent gene order analyses and thus may deteriorate gene cluster prediction. RESULTS: In this work, we present a new dynamic model and efficient computational approaches for gene cluster prediction suitable in scenarios ranging from traditional gene family-based gene cluster prediction, via multiple conflicting gene family annotations, to gene family-free analysis, in which gene clusters are predicted solely on the basis of a pairwise similarity measure of the genes of different genomes. We evaluate our gene family-free model against a gene family-based model on a dataset of 93 bacterial genomes. CONCLUSIONS: Our model is able to detect gene clusters that would be also detected with well-established gene family-based approaches. Moreover, we show that it is able to detect conserved regions which are missed by gene family-based methods due to wrong or deficient gene family assignments. |
format | Online Article Text |
id | pubmed-4274641 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4274641 2015-01-02 Identifying gene clusters by discovering common intervals in indeterminate strings Doerr, Daniel Stoye, Jens Böcker, Sebastian Jahn, Katharina BMC Genomics Research BACKGROUND: Comparative analyses of chromosomal gene orders are successfully used to predict gene clusters in bacterial and fungal genomes. Present models for detecting sets of co-localized genes in chromosomal sequences require prior knowledge of gene family assignments of genes in the dataset of interest. These families are often computationally predicted on the basis of sequence similarity or higher order features of gene products. Errors introduced in this process amplify in subsequent gene order analyses and thus may deteriorate gene cluster prediction. RESULTS: In this work, we present a new dynamic model and efficient computational approaches for gene cluster prediction suitable in scenarios ranging from traditional gene family-based gene cluster prediction, via multiple conflicting gene family annotations, to gene family-free analysis, in which gene clusters are predicted solely on the basis of a pairwise similarity measure of the genes of different genomes. We evaluate our gene family-free model against a gene family-based model on a dataset of 93 bacterial genomes. CONCLUSIONS: Our model is able to detect gene clusters that would be also detected with well-established gene family-based approaches. Moreover, we show that it is able to detect conserved regions which are missed by gene family-based methods due to wrong or deficient gene family assignments. BioMed Central 2014-10-17 /pmc/articles/PMC4274641/ /pubmed/25571793 http://dx.doi.org/10.1186/1471-2164-15-S6-S2 Text en Copyright © 2014 Doerr et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Doerr, Daniel Stoye, Jens Böcker, Sebastian Jahn, Katharina Identifying gene clusters by discovering common intervals in indeterminate strings |
title | Identifying gene clusters by discovering common intervals in indeterminate strings |
title_full | Identifying gene clusters by discovering common intervals in indeterminate strings |
title_fullStr | Identifying gene clusters by discovering common intervals in indeterminate strings |
title_full_unstemmed | Identifying gene clusters by discovering common intervals in indeterminate strings |
title_short | Identifying gene clusters by discovering common intervals in indeterminate strings |
title_sort | identifying gene clusters by discovering common intervals in indeterminate strings |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4274641/ https://www.ncbi.nlm.nih.gov/pubmed/25571793 http://dx.doi.org/10.1186/1471-2164-15-S6-S2 |
work_keys_str_mv | AT doerrdaniel identifyinggeneclustersbydiscoveringcommonintervalsinindeterminatestrings AT stoyejens identifyinggeneclustersbydiscoveringcommonintervalsinindeterminatestrings AT bockersebastian identifyinggeneclustersbydiscoveringcommonintervalsinindeterminatestrings AT jahnkatharina identifyinggeneclustersbydiscoveringcommonintervalsinindeterminatestrings |