FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses

Quantification tools for RNA sequencing (RNA-Seq) analyses are often designed and tested using human transcriptomics data sets, in which full-length transcript sequences are well annotated. For prokaryotic transcriptomics experiments, full-length transcript sequences are seldom known, and coding seq...

Bibliographic Details
Main Authors: Chung, Matthew; Adkins, Ricky S.; Mattick, John S. A.; Bradwell, Katie R.; Shetty, Amol C.; Sadzewicz, Lisa; Tallon, Luke J.; Fraser, Claire M.; Rasko, David A.; Mahurkar, Anup; Dunning Hotopp, Julie C.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7901478/
https://www.ncbi.nlm.nih.gov/pubmed/33436511
http://dx.doi.org/10.1128/mSystems.00917-20
_version_ 1783654382263336960
author Chung, Matthew
Adkins, Ricky S.
Mattick, John S. A.
Bradwell, Katie R.
Shetty, Amol C.
Sadzewicz, Lisa
Tallon, Luke J.
Fraser, Claire M.
Rasko, David A.
Mahurkar, Anup
Dunning Hotopp, Julie C.
author_facet Chung, Matthew
Adkins, Ricky S.
Mattick, John S. A.
Bradwell, Katie R.
Shetty, Amol C.
Sadzewicz, Lisa
Tallon, Luke J.
Fraser, Claire M.
Rasko, David A.
Mahurkar, Anup
Dunning Hotopp, Julie C.
author_sort Chung, Matthew
collection PubMed
description Quantification tools for RNA sequencing (RNA-Seq) analyses are often designed and tested using human transcriptomics data sets, in which full-length transcript sequences are well annotated. For prokaryotic transcriptomics experiments, full-length transcript sequences are seldom known, and coding sequences must instead be used for quantification steps in RNA-Seq analyses. However, operons confound accurate quantification of coding sequences since a single transcript does not necessarily equate to a single gene. Here, we introduce FADU (Feature Aggregate Depth Utility), a quantification tool designed specifically for prokaryotic RNA-Seq analyses. FADU assigns partial count values proportional to the length of the fragment overlapping the target feature. To assess the ability of FADU to quantify genes in prokaryotic transcriptomics analyses, we compared its performance to those of eXpress, featureCounts, HTSeq, kallisto, and Salmon across three paired-end read data sets of (i) Ehrlichia chaffeensis, (ii) Escherichia coli, and (iii) the Wolbachia endosymbiont wBm. Across each of the three data sets, we find that FADU can more accurately quantify operonic genes by deriving proportional counts for multigene fragments within operons. FADU is available at https://github.com/IGS/FADU.
IMPORTANCE Most currently available quantification tools for transcriptomics analyses have been designed for human data sets, in which full-length transcript sequences, including the untranslated regions, are well annotated. In most prokaryotic systems, full-length transcript sequences have yet to be characterized, leading to prokaryotic transcriptomics analyses being performed based on only the coding sequences. In contrast to eukaryotes, prokaryotes contain polycistronic transcripts, and when genes are quantified based on coding sequences instead of transcript sequences, this leads to an increased abundance of improperly assigned ambiguous multigene fragments, specifically those mapping to multiple genes in operons. Here, we describe FADU, a quantification tool for prokaryotic RNA-Seq analyses designed to assign proportional counts with the purpose of better quantifying operonic genes while minimizing the pitfalls associated with improperly assigning fragment counts from ambiguous transcripts.
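The proportional assignment mentioned in the description ("partial count values proportional to the length of the fragment overlapping the target feature") can be illustrated with a minimal sketch. This is not FADU's code. It assumes that each fragment contributes to a feature the fraction of its own bases that overlap that feature, that coordinates are half-open intervals on a single strand, and that the names `overlap_length`, `proportional_counts`, `geneA`, and `geneB` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def overlap_length(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def proportional_counts(fragments, features):
    """Assign each fragment to features in proportion to its overlap.

    Assumption (not taken from the FADU source): a fragment adds
    overlap_bases / fragment_length to every feature it overlaps.
    """
    counts = defaultdict(float)
    for frag_start, frag_end in fragments:
        frag_len = frag_end - frag_start
        if frag_len <= 0:
            continue  # skip empty or malformed fragments
        for name, (feat_start, feat_end) in features.items():
            shared = overlap_length(frag_start, frag_end, feat_start, feat_end)
            if shared:
                counts[name] += shared / frag_len
    return dict(counts)


# Two adjacent genes in a hypothetical operon.
features = {"geneA": (0, 100), "geneB": (100, 200)}

# Three fragments: two span the gene boundary, one lies inside geneB.
fragments = [(50, 150), (90, 130), (150, 190)]

counts = proportional_counts(fragments, features)
print(counts)
```

Under these assumptions, a fragment spanning both genes is split between them instead of being discarded as ambiguous or counted in full for each gene, which is the behavior the description attributes to proportional counting for multigene fragments within operons.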
format Online
Article
Text
id pubmed-7901478
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-79014782021-02-24 FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses Chung, Matthew Adkins, Ricky S. Mattick, John S. A. Bradwell, Katie R. Shetty, Amol C. Sadzewicz, Lisa Tallon, Luke J. Fraser, Claire M. Rasko, David A. Mahurkar, Anup Dunning Hotopp, Julie C. mSystems Research Article Quantification tools for RNA sequencing (RNA-Seq) analyses are often designed and tested using human transcriptomics data sets, in which full-length transcript sequences are well annotated. For prokaryotic transcriptomics experiments, full-length transcript sequences are seldom known, and coding sequences must instead be used for quantification steps in RNA-Seq analyses. However, operons confound accurate quantification of coding sequences since a single transcript does not necessarily equate to a single gene. Here, we introduce FADU (Feature Aggregate Depth Utility), a quantification tool designed specifically for prokaryotic RNA-Seq analyses. FADU assigns partial count values proportional to the length of the fragment overlapping the target feature. To assess the ability of FADU to quantify genes in prokaryotic transcriptomics analyses, we compared its performance to those of eXpress, featureCounts, HTSeq, kallisto, and Salmon across three paired-end read data sets of (i) Ehrlichia chaffeensis, (ii) Escherichia coli, and (iii) the Wolbachia endosymbiont wBm. Across each of the three data sets, we find that FADU can more accurately quantify operonic genes by deriving proportional counts for multigene fragments within operons. FADU is available at https://github.com/IGS/FADU. IMPORTANCE Most currently available quantification tools for transcriptomics analyses have been designed for human data sets, in which full-length transcript sequences, including the untranslated regions, are well annotated. 
In most prokaryotic systems, full-length transcript sequences have yet to be characterized, leading to prokaryotic transcriptomics analyses being performed based on only the coding sequences. In contrast to eukaryotes, prokaryotes contain polycistronic transcripts, and when genes are quantified based on coding sequences instead of transcript sequences, this leads to an increased abundance of improperly assigned ambiguous multigene fragments, specifically those mapping to multiple genes in operons. Here, we describe FADU, a quantification tool for prokaryotic RNA-Seq analyses designed to assign proportional counts with the purpose of better quantifying operonic genes while minimizing the pitfalls associated with improperly assigning fragment counts from ambiguous transcripts. American Society for Microbiology 2021-01-12 /pmc/articles/PMC7901478/ /pubmed/33436511 http://dx.doi.org/10.1128/mSystems.00917-20 Text en Copyright © 2021 Chung et al. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle Research Article
Chung, Matthew
Adkins, Ricky S.
Mattick, John S. A.
Bradwell, Katie R.
Shetty, Amol C.
Sadzewicz, Lisa
Tallon, Luke J.
Fraser, Claire M.
Rasko, David A.
Mahurkar, Anup
Dunning Hotopp, Julie C.
FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses
title FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses
title_full FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses
title_fullStr FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses
title_full_unstemmed FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses
title_short FADU: a Quantification Tool for Prokaryotic Transcriptomic Analyses
title_sort fadu: a quantification tool for prokaryotic transcriptomic analyses
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7901478/
https://www.ncbi.nlm.nih.gov/pubmed/33436511
http://dx.doi.org/10.1128/mSystems.00917-20
work_keys_str_mv AT chungmatthew faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT adkinsrickys faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT mattickjohnsa faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT bradwellkatier faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT shettyamolc faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT sadzewiczlisa faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT tallonlukej faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT fraserclairem faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT raskodavida faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT mahurkaranup faduaquantificationtoolforprokaryotictranscriptomicanalyses
AT dunninghotoppjuliec faduaquantificationtoolforprokaryotictranscriptomicanalyses