digIS: towards detecting distant and putative novel insertion sequence elements in prokaryotic genomes

Bibliographic Details
Main authors: Puterová, Janka; Martínek, Tomáš
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8147514/
https://www.ncbi.nlm.nih.gov/pubmed/34016050
http://dx.doi.org/10.1186/s12859-021-04177-6
author Puterová, Janka
Martínek, Tomáš
collection PubMed
description BACKGROUND: The insertion sequence elements (IS elements) represent the smallest and the most abundant mobile elements in prokaryotic genomes. It has been shown that they play a significant role in genome organization and evolution. To better understand their function in the host genome, it is desirable to have an effective detection and annotation tool. This need becomes even more crucial when considering rapid-growing genomic and metagenomic data. The existing tools for IS elements detection and annotation are usually based on comparing sequence similarity with a database of known IS families. Thus, they have limited ability to discover distant and putative novel IS elements. RESULTS: In this paper, we present digIS, a software tool based on profile hidden Markov models assembled from catalytic domains of transposases. It shows a very good performance in detecting known IS elements when tested on datasets with manually curated annotation. The main contribution of digIS is in its ability to detect distant and putative novel IS elements while maintaining a moderate level of false positives. In this category it outperforms existing tools, especially when tested on large datasets of archaeal and bacterial genomes. CONCLUSION: We provide digIS, a software tool using a novel approach based on manually curated profile hidden Markov models, which is able to detect distant and putative novel IS elements. Although digIS can find known IS elements as well, we expect it to be used primarily by scientists interested in finding novel IS elements. The tool is available at https://github.com/janka2012/digIS. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04177-6.
format Online
Article
Text
id pubmed-8147514
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8147514 2021-05-26 digIS: towards detecting distant and putative novel insertion sequence elements in prokaryotic genomes Puterová, Janka; Martínek, Tomáš BMC Bioinformatics Software
BioMed Central 2021-05-20 /pmc/articles/PMC8147514/ /pubmed/34016050 http://dx.doi.org/10.1186/s12859-021-04177-6 Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title digIS: towards detecting distant and putative novel insertion sequence elements in prokaryotic genomes
topic Software