
Bidirectional best hit r-window gene clusters

BACKGROUND: Conserved gene clusters are groups of genes that are located close to one another in the genomes of several species. They tend to code for proteins that have a functional interaction. The identification of conserved gene clusters is an important step towards understanding genome evolutio...


Bibliographic Details
Main Authors: Zhang, Melvin, Leong, Hon Wai
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009537/
https://www.ncbi.nlm.nih.gov/pubmed/20122239
http://dx.doi.org/10.1186/1471-2105-11-S1-S63
author Zhang, Melvin
Leong, Hon Wai
collection PubMed
description BACKGROUND: Conserved gene clusters are groups of genes that are located close to one another in the genomes of several species. They tend to code for proteins that have a functional interaction. The identification of conserved gene clusters is an important step towards understanding genome evolution and predicting gene function. RESULTS: In this paper, we propose a novel pairwise gene cluster model that combines the notion of bidirectional best hits with the r-window model introduced in 2003 by Durand and Sankoff. The bidirectional best hit (BBH) constraint removes the need to specify the minimum number of shared genes in the r-window model and improves the relevance of the results. We design a subquadratic time algorithm to compute the set of BBH r-window gene clusters efficiently. CONCLUSION: We apply our cluster model to the comparative analysis of E. coli K-12 and B. subtilis and perform an extensive comparison between our new model and the gene teams model developed by Bergeron et al. As compared to the gene teams model, our new cluster model has a slightly lower recall but a higher precision at all levels of recall when the results were ranked using statistical tests. An analysis of the most significant BBH r-window gene clusters shows that they correspond to known operons.
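The description names the bidirectional best hit constraint without spelling it out. The sketch below illustrates the general idea only, under several assumptions that are not taken from the record: each genome is a list of gene family labels, an r-window is the set of labels in r consecutive positions, the score between two windows is the number of labels they share, and ties for best hit are all kept. The function names `r_windows`, `best_hits`, and `bbh_window_pairs` are hypothetical, and the brute-force comparison is quadratic, so it is not the subquadratic algorithm the authors describe.

```python
from typing import Dict, List, Sequence, Set, Tuple


def r_windows(genes: Sequence[str], r: int) -> List[Set[str]]:
    """Return every window of r consecutive genes as a set of family labels.

    Assumption: a genome shorter than r yields one window holding all genes.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    count = max(len(genes) - r + 1, 1)
    return [set(genes[i:i + r]) for i in range(count)]


def best_hits(queries: List[Set[str]], targets: List[Set[str]]) -> Dict[int, Set[int]]:
    """Map each query window index to the target indices sharing the most labels.

    Assumption: a query that shares nothing with any target has no best hit.
    """
    hits: Dict[int, Set[int]] = {}
    for i, query in enumerate(queries):
        scores = [len(query & target) for target in targets]
        top = max(scores, default=0)
        hits[i] = {j for j, s in enumerate(scores) if s == top} if top > 0 else set()
    return hits


def bbh_window_pairs(windows_g: List[Set[str]], windows_h: List[Set[str]]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where each window is a best hit of the other."""
    g_to_h = best_hits(windows_g, windows_h)
    h_to_g = best_hits(windows_h, windows_g)
    return sorted(
        (i, j)
        for i, candidates in g_to_h.items()
        for j in candidates
        if i in h_to_g[j]
    )


# Toy genomes with made-up family labels (not data from the paper).
genome_g = ["a", "b", "c", "x"]
genome_h = ["a", "b", "c", "y"]
pairs = bbh_window_pairs(r_windows(genome_g, 3), r_windows(genome_h, 3))
print(pairs)
```

Note that the caller never supplies a minimum number of shared genes: a pair is kept only because neither window has a better partner, which is the property the description attributes to the BBH constraint.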
format Text
id pubmed-3009537
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3009537 2010-12-23 Bidirectional best hit r-window gene clusters Zhang, Melvin Leong, Hon Wai BMC Bioinformatics Research BioMed Central 2010-01-18 /pmc/articles/PMC3009537/ /pubmed/20122239 http://dx.doi.org/10.1186/1471-2105-11-S1-S63 Text en Copyright ©2010 Zhang and Leong; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bidirectional best hit r-window gene clusters
topic Research