Optimization of miRNA-seq data preprocessing
The past two decades of microRNA (miRNA) research has solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development in high-throughput profiling technology has further advanced our understanding of t...
Main authors: | Tam, Shirley, Tsao, Ming-Sound, McPherson, John D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2015 |
Subjects: | Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4652620/ https://www.ncbi.nlm.nih.gov/pubmed/25888698 http://dx.doi.org/10.1093/bib/bbv019 |
_version_ | 1782401787800584192 |
---|---|
author | Tam, Shirley Tsao, Ming-Sound McPherson, John D. |
author_facet | Tam, Shirley Tsao, Ming-Sound McPherson, John D. |
author_sort | Tam, Shirley |
collection | PubMed |
description | The past two decades of microRNA (miRNA) research has solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development in high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign), and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M, DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments. |
format | Online Article Text |
id | pubmed-4652620 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-46526202015-11-25 Optimization of miRNA-seq data preprocessing Tam, Shirley Tsao, Ming-Sound McPherson, John D. Brief Bioinform Papers The past two decades of microRNA (miRNA) research has solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development in high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign), and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M, DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments. Oxford University Press 2015-11 2015-04-17 /pmc/articles/PMC4652620/ /pubmed/25888698 http://dx.doi.org/10.1093/bib/bbv019 Text en © The Author 2015. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Papers Tam, Shirley Tsao, Ming-Sound McPherson, John D. Optimization of miRNA-seq data preprocessing |
title | Optimization of miRNA-seq data preprocessing |
title_full | Optimization of miRNA-seq data preprocessing |
title_fullStr | Optimization of miRNA-seq data preprocessing |
title_full_unstemmed | Optimization of miRNA-seq data preprocessing |
title_short | Optimization of miRNA-seq data preprocessing |
title_sort | optimization of mirna-seq data preprocessing |
topic | Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4652620/ https://www.ncbi.nlm.nih.gov/pubmed/25888698 http://dx.doi.org/10.1093/bib/bbv019 |
work_keys_str_mv | AT tamshirley optimizationofmirnaseqdatapreprocessing AT tsaomingsound optimizationofmirnaseqdatapreprocessing AT mcphersonjohnd optimizationofmirnaseqdatapreprocessing |