
iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq

BACKGROUND: Qualitative and quantitative analysis of small non-coding RNAs by next-generation sequencing (smallRNA-Seq) is a novel technology increasingly used to investigate, with high sensitivity and specificity, RNA populations comprising microRNAs and other regulatory small transcripts. Ana...


Bibliographic Details
Main authors: Giurato, Giorgio, De Filippo, Maria Rosaria, Rinaldi, Antonio, Hashim, Adnan, Nassa, Giovanni, Ravo, Maria, Rizzo, Francesca, Tarallo, Roberta, Weisz, Alessandro
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3878829/
https://www.ncbi.nlm.nih.gov/pubmed/24330401
http://dx.doi.org/10.1186/1471-2105-14-362
_version_ 1782297873492213760
author Giurato, Giorgio
De Filippo, Maria Rosaria
Rinaldi, Antonio
Hashim, Adnan
Nassa, Giovanni
Ravo, Maria
Rizzo, Francesca
Tarallo, Roberta
Weisz, Alessandro
author_facet Giurato, Giorgio
De Filippo, Maria Rosaria
Rinaldi, Antonio
Hashim, Adnan
Nassa, Giovanni
Ravo, Maria
Rizzo, Francesca
Tarallo, Roberta
Weisz, Alessandro
author_sort Giurato, Giorgio
collection PubMed
description BACKGROUND: Qualitative and quantitative analysis of small non-coding RNAs by next-generation sequencing (smallRNA-Seq) is a novel technology increasingly used to investigate, with high sensitivity and specificity, RNA populations comprising microRNAs and other regulatory small transcripts. Analysis of smallRNA-Seq data to gather biologically relevant information, such as detection and differential expression analysis of known and novel non-coding RNAs and target prediction, requires implementation of multiple statistical and bioinformatics tools from different sources, each focusing on a specific step of the analysis pipeline. As a consequence, the analytical workflow is slowed down by the need for continuous interventions by the operator, a critical factor when large numbers of datasets need to be analyzed at once. RESULTS: We designed a novel modular pipeline (iMir) for comprehensive analysis of smallRNA-Seq data, comprising specific tools for adapter trimming, quality filtering, differential expression analysis, biological target prediction and other useful options, by integrating multiple open source modules and resources in an automated workflow. As statistics is crucial in deep-sequencing data analysis, we devised and integrated in iMir tools based on different statistical approaches to allow the operator to analyze data rigorously. The pipeline created here proved to be more efficient and time-saving than currently available methods and, in addition, flexible enough to allow the user to select the preferred combination of analytical steps. 
We present here the results obtained by applying this pipeline to analyze simultaneously 6 smallRNA-Seq datasets from either exponentially growing or growth-arrested human breast cancer MCF-7 cells, which led to the rapid and accurate identification, quantitation and differential expression analysis of ~450 miRNAs, including several novel miRNAs and isomiRs, as well as identification of the putative mRNA targets of differentially expressed miRNAs. In addition, iMir also allowed the identification of ~70 piRNAs (piwi-interacting RNAs), some of which were differentially expressed in proliferating vs growth-arrested cells. CONCLUSION: The integrated data analysis pipeline described here is based on a reliable, flexible and fully automated workflow, useful to rapidly and efficiently analyze high-throughput smallRNA-Seq data, such as those produced by the most recent high-performance next-generation sequencers. iMir is available at http://www.labmedmolge.unisa.it/inglese/research/imir.
format Online
Article
Text
id pubmed-3878829
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-38788292014-01-03 iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq Giurato, Giorgio De Filippo, Maria Rosaria Rinaldi, Antonio Hashim, Adnan Nassa, Giovanni Ravo, Maria Rizzo, Francesca Tarallo, Roberta Weisz, Alessandro BMC Bioinformatics Software BACKGROUND: Qualitative and quantitative analysis of small non-coding RNAs by next-generation sequencing (smallRNA-Seq) is a novel technology increasingly used to investigate, with high sensitivity and specificity, RNA populations comprising microRNAs and other regulatory small transcripts. Analysis of smallRNA-Seq data to gather biologically relevant information, such as detection and differential expression analysis of known and novel non-coding RNAs and target prediction, requires implementation of multiple statistical and bioinformatics tools from different sources, each focusing on a specific step of the analysis pipeline. As a consequence, the analytical workflow is slowed down by the need for continuous interventions by the operator, a critical factor when large numbers of datasets need to be analyzed at once. RESULTS: We designed a novel modular pipeline (iMir) for comprehensive analysis of smallRNA-Seq data, comprising specific tools for adapter trimming, quality filtering, differential expression analysis, biological target prediction and other useful options, by integrating multiple open source modules and resources in an automated workflow. As statistics is crucial in deep-sequencing data analysis, we devised and integrated in iMir tools based on different statistical approaches to allow the operator to analyze data rigorously. The pipeline created here proved to be more efficient and time-saving than currently available methods and, in addition, flexible enough to allow the user to select the preferred combination of analytical steps. 
We present here the results obtained by applying this pipeline to analyze simultaneously 6 smallRNA-Seq datasets from either exponentially growing or growth-arrested human breast cancer MCF-7 cells, which led to the rapid and accurate identification, quantitation and differential expression analysis of ~450 miRNAs, including several novel miRNAs and isomiRs, as well as identification of the putative mRNA targets of differentially expressed miRNAs. In addition, iMir also allowed the identification of ~70 piRNAs (piwi-interacting RNAs), some of which were differentially expressed in proliferating vs growth-arrested cells. CONCLUSION: The integrated data analysis pipeline described here is based on a reliable, flexible and fully automated workflow, useful to rapidly and efficiently analyze high-throughput smallRNA-Seq data, such as those produced by the most recent high-performance next-generation sequencers. iMir is available at http://www.labmedmolge.unisa.it/inglese/research/imir. BioMed Central 2013-12-13 /pmc/articles/PMC3878829/ /pubmed/24330401 http://dx.doi.org/10.1186/1471-2105-14-362 Text en Copyright © 2013 Giurato et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Giurato, Giorgio
De Filippo, Maria Rosaria
Rinaldi, Antonio
Hashim, Adnan
Nassa, Giovanni
Ravo, Maria
Rizzo, Francesca
Tarallo, Roberta
Weisz, Alessandro
iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq
title iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq
title_full iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq
title_fullStr iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq
title_full_unstemmed iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq
title_short iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq
title_sort imir: an integrated pipeline for high-throughput analysis of small non-coding rna data obtained by smallrna-seq
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3878829/
https://www.ncbi.nlm.nih.gov/pubmed/24330401
http://dx.doi.org/10.1186/1471-2105-14-362
work_keys_str_mv AT giuratogiorgio imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT defilippomariarosaria imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT rinaldiantonio imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT hashimadnan imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT nassagiovanni imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT ravomaria imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT rizzofrancesca imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT taralloroberta imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq
AT weiszalessandro imiranintegratedpipelineforhighthroughputanalysisofsmallnoncodingrnadataobtainedbysmallrnaseq