Optimization of miRNA-seq data preprocessing

Bibliographic Details
Main authors: Tam, Shirley; Tsao, Ming-Sound; McPherson, John D.
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4652620/
https://www.ncbi.nlm.nih.gov/pubmed/25888698
http://dx.doi.org/10.1093/bib/bbv019
author Tam, Shirley
Tsao, Ming-Sound
McPherson, John D.
collection PubMed
description The past two decades of microRNA (miRNA) research have solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development of high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign) and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M-values, DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments.
format Online
Article
Text
id pubmed-4652620
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4652620 2015-11-25 Optimization of miRNA-seq data preprocessing Tam, Shirley; Tsao, Ming-Sound; McPherson, John D. Brief Bioinform Papers Oxford University Press 2015-11 2015-04-17 /pmc/articles/PMC4652620/ /pubmed/25888698 http://dx.doi.org/10.1093/bib/bbv019 Text en © The Author 2015. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Optimization of miRNA-seq data preprocessing
topic Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4652620/
https://www.ncbi.nlm.nih.gov/pubmed/25888698
http://dx.doi.org/10.1093/bib/bbv019
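
The description in this record names counts-per-million and upper quartile scaling among the normalization methods the authors compared. As a rough illustration only, and not the authors' implementation, the two can be sketched in Python with NumPy as below. The function names and the toy count matrix are hypothetical, and the upper quartile variant shown (75th percentile of each sample's non-zero counts, rescaled by the mean of those percentiles) is one common simplification that may differ from the exact procedure used in the paper.

```python
import numpy as np


def counts_per_million(counts):
    """Scale each sample (column) so that its counts sum to one million."""
    counts = np.asarray(counts, dtype=float)
    library_sizes = counts.sum(axis=0)
    return counts / library_sizes * 1e6


def upper_quartile_scale(counts):
    """Divide each sample by the 75th percentile of its non-zero counts.

    The result is multiplied by the mean of those percentiles so values
    stay on a count-like scale. Assumes every sample has at least one
    non-zero count; this is a simplified variant, not a reference
    implementation.
    """
    counts = np.asarray(counts, dtype=float)
    uq = np.array([np.percentile(col[col > 0], 75) for col in counts.T])
    return counts / uq * uq.mean()


# Hypothetical toy matrix: rows are miRNAs, columns are samples.
# The second sample is simply sequenced twice as deeply as the first.
toy = np.array([[10, 20],
                [30, 60],
                [60, 120]])

cpm = counts_per_million(toy)
uqs = upper_quartile_scale(toy)
```

In this toy case both methods remove the twofold depth difference, so the two columns become identical after normalization. Real miRNA-seq data are often dominated by a few highly abundant miRNAs, which is one reason different scaling factors can lead to different downstream results.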