Tatajuba: exploring the distribution of homopolymer tracts

Length variation of homopolymeric tracts, which induces phase variation, is known to regulate gene expression leading to phenotypic variation in a wide range of bacterial species. There is no specialized bioinformatics software which can, at scale, exhaustively explore and describe these features fr...

Bibliographic Details
Main authors: de Oliveira Martins, Leonardo; Bloomfield, Samuel; Stoakes, Emily; Grant, Andrew J; Page, Andrew J; Mather, Alison E
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8808543/
https://www.ncbi.nlm.nih.gov/pubmed/35118377
http://dx.doi.org/10.1093/nargab/lqac003
author de Oliveira Martins, Leonardo
Bloomfield, Samuel
Stoakes, Emily
Grant, Andrew J
Page, Andrew J
Mather, Alison E
collection PubMed
description Length variation of homopolymeric tracts, which induces phase variation, is known to regulate gene expression leading to phenotypic variation in a wide range of bacterial species. There is no specialized bioinformatics software which can, at scale, exhaustively explore and describe these features from sequencing data. Identifying these is non-trivial as sequencing and bioinformatics methods are prone to introducing artefacts when presented with homopolymeric tracts due to the decreased base diversity. We present tatajuba, which can automatically identify potential homopolymeric tracts and help predict their putative phenotypic impact, allowing for rapid investigation. We use it to detect all tracts in two separate datasets, one of Campylobacter jejuni and one of three Bordetella species, and to highlight those tracts that are polymorphic across samples. With this we confirm homopolymer tract variation with phenotypic impact found in previous studies and additionally find many more with potential variability. The software is written in C and is available under the open source licence GNU GPLv3.
format Online
Article
Text
id pubmed-8808543
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8808543 2022-02-02 Tatajuba: exploring the distribution of homopolymer tracts de Oliveira Martins, Leonardo Bloomfield, Samuel Stoakes, Emily Grant, Andrew J Page, Andrew J Mather, Alison E NAR Genom Bioinform Standard Article Oxford University Press 2022-02-02 /pmc/articles/PMC8808543/ /pubmed/35118377 http://dx.doi.org/10.1093/nargab/lqac003 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Tatajuba: exploring the distribution of homopolymer tracts
topic Standard Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8808543/
https://www.ncbi.nlm.nih.gov/pubmed/35118377
http://dx.doi.org/10.1093/nargab/lqac003