APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data
Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data, which are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO,...
Main authors: | Leonard, Simon; Meyer, Sam; Lacour, Stephan; Nasser, William; Hommais, Florence; Reverchon, Sylvie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735904/ https://www.ncbi.nlm.nih.gov/pubmed/31147705 http://dx.doi.org/10.1093/nar/gkz485 |
_version_ | 1783450431985287168 |
---|---|
author | Leonard, Simon Meyer, Sam Lacour, Stephan Nasser, William Hommais, Florence Reverchon, Sylvie |
author_facet | Leonard, Simon Meyer, Sam Lacour, Stephan Nasser, William Hommais, Florence Reverchon, Sylvie |
author_sort | Leonard, Simon |
collection | PubMed |
description | Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data, which are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO, a new algorithm to detect small transcripts from paired-end bacterial RNA-seq data. In contrast to previous approaches that start from the read coverage distribution, APERO analyzes boundaries of individual sequenced fragments to infer the 5′ and 3′ ends of all transcripts. Since sRNAs are about the same size as individual fragments (50–350 nucleotides), this algorithm provides a significantly higher accuracy and robustness, e.g., with respect to spontaneous internal breaking sites. To demonstrate this improvement, we develop a comparative assessment on datasets from Escherichia coli and Salmonella enterica, based on experimentally validated sRNAs. We also identify the small transcript repertoire of Dickeya dadantii including putative intergenic RNAs, 5′ UTR or 3′ UTR-derived RNA products and antisense RNAs. Comparisons to annotations as well as RACE-PCR experimental data confirm the precision of the detected transcripts. Altogether, APERO outperforms all existing methods in terms of sRNA detection and boundary precision, which is crucial for comprehensive genome annotations. It is freely available as an open source R package on https://github.com/Simon-Leonard/APERO |
format | Online Article Text |
id | pubmed-6735904 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-67359042019-09-16 APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data Leonard, Simon Meyer, Sam Lacour, Stephan Nasser, William Hommais, Florence Reverchon, Sylvie Nucleic Acids Res Methods Online Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data, which are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO, a new algorithm to detect small transcripts from paired-end bacterial RNA-seq data. In contrast to previous approaches that start from the read coverage distribution, APERO analyzes boundaries of individual sequenced fragments to infer the 5′ and 3′ ends of all transcripts. Since sRNAs are about the same size as individual fragments (50–350 nucleotides), this algorithm provides a significantly higher accuracy and robustness, e.g., with respect to spontaneous internal breaking sites. To demonstrate this improvement, we develop a comparative assessment on datasets from Escherichia coli and Salmonella enterica, based on experimentally validated sRNAs. We also identify the small transcript repertoire of Dickeya dadantii including putative intergenic RNAs, 5′ UTR or 3′ UTR-derived RNA products and antisense RNAs. Comparisons to annotations as well as RACE-PCR experimental data confirm the precision of the detected transcripts. Altogether, APERO outperforms all existing methods in terms of sRNA detection and boundary precision, which is crucial for comprehensive genome annotations. It is freely available as an open source R package on https://github.com/Simon-Leonard/APERO Oxford University Press 2019-09-05 2019-05-31 /pmc/articles/PMC6735904/ /pubmed/31147705 http://dx.doi.org/10.1093/nar/gkz485 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Leonard, Simon Meyer, Sam Lacour, Stephan Nasser, William Hommais, Florence Reverchon, Sylvie APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data |
title | APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data |
title_full | APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data |
title_fullStr | APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data |
title_full_unstemmed | APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data |
title_short | APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data |
title_sort | apero: a genome-wide approach for identifying bacterial small rnas from rna-seq data |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735904/ https://www.ncbi.nlm.nih.gov/pubmed/31147705 http://dx.doi.org/10.1093/nar/gkz485 |
work_keys_str_mv | AT leonardsimon aperoagenomewideapproachforidentifyingbacterialsmallrnasfromrnaseqdata AT meyersam aperoagenomewideapproachforidentifyingbacterialsmallrnasfromrnaseqdata AT lacourstephan aperoagenomewideapproachforidentifyingbacterialsmallrnasfromrnaseqdata AT nasserwilliam aperoagenomewideapproachforidentifyingbacterialsmallrnasfromrnaseqdata AT hommaisflorence aperoagenomewideapproachforidentifyingbacterialsmallrnasfromrnaseqdata AT reverchonsylvie aperoagenomewideapproachforidentifyingbacterialsmallrnasfromrnaseqdata |
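
The description above says that APERO infers 5′ and 3′ transcript ends from the boundaries of individual paired-end fragments rather than from read coverage. The following Python sketch is only a toy illustration of that general idea and is not the APERO implementation (which is distributed as an R package). The function names, the `min_count` and `merge_width` parameters, and the example coordinates are all hypothetical.

```python
from collections import Counter


def call_five_prime_ends(fragments, min_count=3, merge_width=2):
    """Group fragment start positions into candidate 5' ends.

    fragments: list of (start, end) tuples on a single strand.
    min_count and merge_width are illustrative assumptions, not APERO defaults.
    """
    starts = Counter(start for start, _ in fragments)
    clusters = []
    for pos in sorted(starts):
        # Chain start positions that lie within merge_width of the previous one.
        if clusters and pos - clusters[-1][-1] <= merge_width:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    called = []
    for cluster in clusters:
        total = sum(starts[p] for p in cluster)
        if total >= min_count:
            # Report the most frequent start in the cluster as the 5' end.
            best = max(cluster, key=lambda p: starts[p])
            called.append((best, cluster[0], cluster[-1]))
    return called


def infer_three_prime_end(fragments, low, high):
    """Most frequent end among fragments that start inside [low, high]."""
    ends = Counter(end for start, end in fragments if low <= start <= high)
    return ends.most_common(1)[0][0]


def detect_transcripts(fragments, min_count=3, merge_width=2):
    """Return (five_prime, three_prime) pairs for each supported start cluster."""
    transcripts = []
    for best, low, high in call_five_prime_ends(fragments, min_count, merge_width):
        transcripts.append((best, infer_three_prime_end(fragments, low, high)))
    return transcripts


if __name__ == "__main__":
    # Hypothetical fragments: most span 100-180; two come from an internal break.
    fragments = [(100, 180), (100, 180), (101, 180),
                 (100, 150), (140, 180), (300, 380)]
    print(detect_transcripts(fragments))
```

In this made-up example, the fragments `(100, 150)` and `(140, 180)` mimic an internal breaking site. Because ends are voted on per fragment, the weakly supported start at 140 is not called and the majority 3′ end is kept, which is the kind of robustness the abstract attributes to a fragment-boundary approach.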