Discovery of multi-operon colinear syntenic blocks in microbial genomes

Bibliographic Details
Main authors: Svetlitsky, Dina; Dagan, Tal; Ziv-Ukelson, Michal
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355258/
https://www.ncbi.nlm.nih.gov/pubmed/32657415
http://dx.doi.org/10.1093/bioinformatics/btaa503
_version_ 1783558238778687488
author Svetlitsky, Dina
Dagan, Tal
Ziv-Ukelson, Michal
author_sort Svetlitsky, Dina
collection PubMed
description MOTIVATION: An important task in comparative genomics is to detect functional units by analyzing gene-context patterns. Colinear syntenic blocks (CSBs) are groups of genes that are consistently encoded in the same neighborhood and in the same order across a wide range of taxa. Such CSBs are likely essential for the regulation of gene expression in prokaryotes. Recent results indicate that colinearity can be conserved across multiple operons, thus motivating the discovery of multi-operon CSBs. This computational task raises scalability challenges in large datasets. RESULTS: We propose an efficient algorithm for the discovery of cross-strand multi-operon CSBs in large genomic datasets. The proposed algorithm uses match-point arithmetic, which is scalable for large datasets of microbial genomes in terms of running time and space requirements. The algorithm is implemented and incorporated into a tool with a graphical user interface, called CSBFinder-S. We applied CSBFinder-S to data mine 1485 prokaryotic genomes and analyzed the identified cross-strand CSBs. Our results indicate that most of the syntenic blocks are exclusively colinear. Additional results indicate that transcriptional regulation by overlapping transcriptional genes is abundant in bacteria. We demonstrate the utility of CSBFinder-S to identify common function of the gene-pair PulEF in multiple contexts, including Type 2 Secretion System, Type 4 Pilus System and DNA uptake machinery. AVAILABILITY AND IMPLEMENTATION: CSBFinder-S software and code are publicly available at https://github.com/dinasv/CSBFinder. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7355258
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7355258 2020-07-16 Discovery of multi-operon colinear syntenic blocks in microbial genomes Svetlitsky, Dina Dagan, Tal Ziv-Ukelson, Michal Bioinformatics Bioinformatics of Microbes and Microbiomes
Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355258/ /pubmed/32657415 http://dx.doi.org/10.1093/bioinformatics/btaa503 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Bioinformatics of Microbes and Microbiomes
Svetlitsky, Dina
Dagan, Tal
Ziv-Ukelson, Michal
Discovery of multi-operon colinear syntenic blocks in microbial genomes
title Discovery of multi-operon colinear syntenic blocks in microbial genomes
title_sort discovery of multi-operon colinear syntenic blocks in microbial genomes
topic Bioinformatics of Microbes and Microbiomes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355258/
https://www.ncbi.nlm.nih.gov/pubmed/32657415
http://dx.doi.org/10.1093/bioinformatics/btaa503
work_keys_str_mv AT svetlitskydina discoveryofmultioperoncolinearsyntenicblocksinmicrobialgenomes
AT dagantal discoveryofmultioperoncolinearsyntenicblocksinmicrobialgenomes
AT zivukelsonmichal discoveryofmultioperoncolinearsyntenicblocksinmicrobialgenomes