
Probabilistic identification of bacterial essential genes via insertion density using TraDIS data with Tn5 libraries


Bibliographic Details
Main Authors:	Nlebedim, Valentine U; Chaudhuri, Roy R; Walters, Kevin
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8652038/ https://www.ncbi.nlm.nih.gov/pubmed/34255819 http://dx.doi.org/10.1093/bioinformatics/btab508

author	Nlebedim, Valentine U; Chaudhuri, Roy R; Walters, Kevin
collection	PubMed
description	MOTIVATION: Probabilistic identification of bacterial essential genes using transposon-directed insertion-site sequencing (TraDIS) data based on Tn5 libraries has received relatively little attention in the literature; most methods are designed for mariner transposon insertions. Analysis of Tn5 transposon-based genomic data is challenging due to the high insertion density and genomic resolution. We present a novel probabilistic Bayesian approach for classifying bacterial essential genes using transposon insertion density derived from transposon insertion sequencing data. We implement a Markov chain Monte Carlo sampling procedure to estimate the posterior probability that any given gene is essential, and we use a Bayesian decision theory approach to select essential genes. We assess the effectiveness of our approach via analysis of both simulated data and three previously published datasets from Escherichia coli, Salmonella Typhimurium and Staphylococcus aureus. These three bacteria have relatively well-characterized essential genes, which allows us to test our classification procedure using receiver operating characteristic curves and the areas under those curves. We compare the classification performance with that of Bio-Tradis, a standard tool for bacterial gene classification. RESULTS: Our method is able to classify genes in the three datasets with areas under the curve between 0.967 and 0.983. Our simulated datasets show that the number of insertions and the extent to which insertions are tolerated in the distal regions of essential genes are both important in determining classification accuracy. Importantly, our method gives the user the option of classifying essential genes based on user-supplied costs of false discovery and false non-discovery. AVAILABILITY AND IMPLEMENTATION: An R package that implements the method presented in this paper is available for download from https://github.com/Kevin-walters/insdens. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-8652038
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-8652038 2021-12-08 Probabilistic identification of bacterial essential genes via insertion density using TraDIS data with Tn5 libraries Nlebedim, Valentine U Chaudhuri, Roy R Walters, Kevin Bioinformatics Original Papers Oxford University Press 2021-07-13 /pmc/articles/PMC8652038/ /pubmed/34255819 http://dx.doi.org/10.1093/bioinformatics/btab508 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Probabilistic identification of bacterial essential genes via insertion density using TraDIS data with Tn5 libraries
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8652038/ https://www.ncbi.nlm.nih.gov/pubmed/34255819 http://dx.doi.org/10.1093/bioinformatics/btab508
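
The description field in this record says that genes are selected using Bayesian decision theory with user-supplied costs of false discovery and false non-discovery. The abstract does not state the exact loss function, so the following is only a minimal sketch of one common formulation, assuming a simple additive per-gene loss: calling a gene essential costs `cost_fd` when it is in fact non-essential, and calling it non-essential costs `cost_fnd` when it is in fact essential. Under that assumption the expected cost is minimised by calling a gene essential whenever its posterior probability exceeds `cost_fd / (cost_fd + cost_fnd)`. The function names, gene names and probabilities below are hypothetical and are not taken from the insdens package.

```python
def essential_threshold(cost_fd: float, cost_fnd: float) -> float:
    """Posterior-probability cutoff that minimises expected cost.

    Assumes an additive per-gene loss (an assumption of this sketch,
    not a statement of the paper's loss function):
      expected cost of calling "essential"     = cost_fd  * (1 - p)
      expected cost of calling "non-essential" = cost_fnd * p
    so "essential" is the cheaper call when p > cost_fd / (cost_fd + cost_fnd).
    """
    if cost_fd <= 0 or cost_fnd <= 0:
        raise ValueError("costs must be positive")
    return cost_fd / (cost_fd + cost_fnd)


def classify(posteriors: dict, cost_fd: float, cost_fnd: float) -> dict:
    """Map each gene to True (essential) or False (non-essential)."""
    cutoff = essential_threshold(cost_fd, cost_fnd)
    return {gene: p > cutoff for gene, p in posteriors.items()}


# Hypothetical posterior probabilities, made up purely for illustration.
example_posteriors = {"gene_a": 0.97, "gene_b": 0.40, "gene_c": 0.08}

# A missed essential gene is treated as three times as costly as a false call,
# which gives a cutoff of 1 / (1 + 3) = 0.25.
calls = classify(example_posteriors, cost_fd=1.0, cost_fnd=3.0)
print(calls)
```

With equal costs the cutoff is 0.5; raising the cost of false non-discovery lowers the cutoff, so more genes are called essential.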
