reactIDR: evaluation of the statistical reproducibility of high-throughput structural analyses towards a robust RNA structure prediction

BACKGROUND: Recently, next-generation sequencing techniques have been applied for the detection of RNA secondary structures, which is referred to as high-throughput RNA structural (HTS) analyses, and many different protocols have been used to detect comprehensive RNA structures at single-nucleotide...

Bibliographic Details
Main Authors: Kawaguchi, Risa, Kiryu, Hisanori, Iwakiri, Junichi, Sese, Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439966/
https://www.ncbi.nlm.nih.gov/pubmed/30925857
http://dx.doi.org/10.1186/s12859-019-2645-4
author Kawaguchi, Risa
Kiryu, Hisanori
Iwakiri, Junichi
Sese, Jun
collection PubMed
description BACKGROUND: Recently, next-generation sequencing techniques have been applied for the detection of RNA secondary structures, which is referred to as high-throughput RNA structural (HTS) analyses, and many different protocols have been used to detect comprehensive RNA structures at single-nucleotide resolution. However, the existing computational analyses heavily depend on the experimental methodology to generate data, which results in difficulties associated with statistically sound comparisons or combining the results obtained using different HTS methods. RESULTS: Here, we introduced a statistical framework, reactIDR, which can be applied to the experimental data obtained using multiple HTS methodologies. Using this approach, nucleotides are classified into three structural categories, loop, stem/background, and unmapped. reactIDR uses the irreproducible discovery rate (IDR) with a hidden Markov model to discriminate between the true and spurious signals obtained in the replicated HTS experiments accurately, and it is able to incorporate an expectation-maximization algorithm and supervised learning for efficient parameter optimization. The results of our analyses of the real-life HTS data showed that reactIDR had the highest accuracy in the classification of ribosomal RNA stem/loop structures when using both individual and integrated HTS datasets, and its results corresponded the best to the three-dimensional structures. CONCLUSIONS: We have developed a novel software, reactIDR, for the prediction of stem/loop regions from the HTS analysis datasets. For the rRNA structure analyses, reactIDR was shown to have robust accuracy across different datasets by using the reproducibility criterion, suggesting its potential for increasing the value of existing HTS datasets. reactIDR is publicly available at https://github.com/carushi/reactIDR. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2645-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6439966
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-64399662019-04-11 reactIDR: evaluation of the statistical reproducibility of high-throughput structural analyses towards a robust RNA structure prediction Kawaguchi, Risa Kiryu, Hisanori Iwakiri, Junichi Sese, Jun BMC Bioinformatics Research BACKGROUND: Recently, next-generation sequencing techniques have been applied for the detection of RNA secondary structures, which is referred to as high-throughput RNA structural (HTS) analyses, and many different protocols have been used to detect comprehensive RNA structures at single-nucleotide resolution. However, the existing computational analyses heavily depend on the experimental methodology to generate data, which results in difficulties associated with statistically sound comparisons or combining the results obtained using different HTS methods. RESULTS: Here, we introduced a statistical framework, reactIDR, which can be applied to the experimental data obtained using multiple HTS methodologies. Using this approach, nucleotides are classified into three structural categories, loop, stem/background, and unmapped. reactIDR uses the irreproducible discovery rate (IDR) with a hidden Markov model to discriminate between the true and spurious signals obtained in the replicated HTS experiments accurately, and it is able to incorporate an expectation-maximization algorithm and supervised learning for efficient parameter optimization. The results of our analyses of the real-life HTS data showed that reactIDR had the highest accuracy in the classification of ribosomal RNA stem/loop structures when using both individual and integrated HTS datasets, and its results corresponded the best to the three-dimensional structures. CONCLUSIONS: We have developed a novel software, reactIDR, for the prediction of stem/loop regions from the HTS analysis datasets. 
For the rRNA structure analyses, reactIDR was shown to have robust accuracy across different datasets by using the reproducibility criterion, suggesting its potential for increasing the value of existing HTS datasets. reactIDR is publicly available at https://github.com/carushi/reactIDR. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2645-4) contains supplementary material, which is available to authorized users. BioMed Central 2019-03-29 /pmc/articles/PMC6439966/ /pubmed/30925857 http://dx.doi.org/10.1186/s12859-019-2645-4 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title reactIDR: evaluation of the statistical reproducibility of high-throughput structural analyses towards a robust RNA structure prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439966/
https://www.ncbi.nlm.nih.gov/pubmed/30925857
http://dx.doi.org/10.1186/s12859-019-2645-4