
APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data

Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data, which are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO,...


Bibliographic Details
Main Authors: Leonard, Simon; Meyer, Sam; Lacour, Stephan; Nasser, William; Hommais, Florence; Reverchon, Sylvie
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735904/
https://www.ncbi.nlm.nih.gov/pubmed/31147705
http://dx.doi.org/10.1093/nar/gkz485
_version_ 1783450431985287168
author Leonard, Simon
Meyer, Sam
Lacour, Stephan
Nasser, William
Hommais, Florence
Reverchon, Sylvie
author_facet Leonard, Simon
Meyer, Sam
Lacour, Stephan
Nasser, William
Hommais, Florence
Reverchon, Sylvie
author_sort Leonard, Simon
collection PubMed
description Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data, which are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO, a new algorithm to detect small transcripts from paired-end bacterial RNA-seq data. In contrast to previous approaches that start from the read coverage distribution, APERO analyzes boundaries of individual sequenced fragments to infer the 5′ and 3′ ends of all transcripts. Since sRNAs are about the same size as individual fragments (50–350 nucleotides), this algorithm provides a significantly higher accuracy and robustness, e.g., with respect to spontaneous internal breaking sites. To demonstrate this improvement, we develop a comparative assessment on datasets from Escherichia coli and Salmonella enterica, based on experimentally validated sRNAs. We also identify the small transcript repertoire of Dickeya dadantii including putative intergenic RNAs, 5′ UTR or 3′ UTR-derived RNA products and antisense RNAs. Comparisons to annotations as well as RACE-PCR experimental data confirm the precision of the detected transcripts. Altogether, APERO outperforms all existing methods in terms of sRNA detection and boundary precision, which is crucial for comprehensive genome annotations. It is freely available as an open source R package on https://github.com/Simon-Leonard/APERO
format Online
Article
Text
id pubmed-6735904
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-67359042019-09-16 APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data Leonard, Simon Meyer, Sam Lacour, Stephan Nasser, William Hommais, Florence Reverchon, Sylvie Nucleic Acids Res Methods Online Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data, which are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO, a new algorithm to detect small transcripts from paired-end bacterial RNA-seq data. In contrast to previous approaches that start from the read coverage distribution, APERO analyzes boundaries of individual sequenced fragments to infer the 5′ and 3′ ends of all transcripts. Since sRNAs are about the same size as individual fragments (50–350 nucleotides), this algorithm provides a significantly higher accuracy and robustness, e.g., with respect to spontaneous internal breaking sites. To demonstrate this improvement, we develop a comparative assessment on datasets from Escherichia coli and Salmonella enterica, based on experimentally validated sRNAs. We also identify the small transcript repertoire of Dickeya dadantii including putative intergenic RNAs, 5′ UTR or 3′ UTR-derived RNA products and antisense RNAs. Comparisons to annotations as well as RACE-PCR experimental data confirm the precision of the detected transcripts. Altogether, APERO outperforms all existing methods in terms of sRNA detection and boundary precision, which is crucial for comprehensive genome annotations. It is freely available as an open source R package on https://github.com/Simon-Leonard/APERO Oxford University Press 2019-09-05 2019-05-31 /pmc/articles/PMC6735904/ /pubmed/31147705 http://dx.doi.org/10.1093/nar/gkz485 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Methods Online
Leonard, Simon
Meyer, Sam
Lacour, Stephan
Nasser, William
Hommais, Florence
Reverchon, Sylvie
APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data
title APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data
title_full APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data
title_fullStr APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data
title_full_unstemmed APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data
title_short APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data
title_sort apero: a genome-wide approach for identifying bacterial small rnas from rna-seq data
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735904/
https://www.ncbi.nlm.nih.gov/pubmed/31147705
http://dx.doi.org/10.1093/nar/gkz485