
Detection of microRNAs in color space

Motivation: Deep sequencing provides inexpensive opportunities to characterize the transcriptional diversity of known genomes. The AB SOLiD technology generates millions of short sequencing reads in color-space; that is, the raw data is a sequence of colors, where each color represents 2 nt and each...


Bibliographic Details
Main authors: Marco, Antonio; Griffiths-Jones, Sam
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268249/
https://www.ncbi.nlm.nih.gov/pubmed/22171334
http://dx.doi.org/10.1093/bioinformatics/btr686
author Marco, Antonio
Griffiths-Jones, Sam
collection PubMed
description Motivation: Deep sequencing provides inexpensive opportunities to characterize the transcriptional diversity of known genomes. The AB SOLiD technology generates millions of short sequencing reads in color-space; that is, the raw data is a sequence of colors, where each color represents 2 nt and each nucleotide is represented by two consecutive colors. This strategy is purported to have several advantages, including increased ability to distinguish sequencing errors from polymorphisms. Several programs have been developed to map short reads to genomes in color space. However, a number of previously unexplored technical issues arise when using SOLiD technology to characterize microRNAs. Results: Here we explore these technical difficulties. First, since the sequenced reads are longer than the biological sequences, every read is expected to contain linker fragments. The color-calling error rate increases toward the 3′ end of the read such that recognizing the linker sequence for removal becomes problematic. Second, mapping in color space may lead to the loss of the first nucleotide of each read. We propose a sequential trimming and mapping approach to map small RNAs. Using our strategy, we reanalyze three published insect small RNA deep sequencing datasets and characterize 22 new microRNAs. Availability and implementation: A bash shell script to perform the sequential trimming and mapping procedure, called SeqTrimMap, is available at: http://www.mirbase.org/tools/seqtrimmap/ Contact: antonio.marco@manchester.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3268249
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3268249 2012-01-30 Detection of microRNAs in color space Marco, Antonio Griffiths-Jones, Sam Bioinformatics Original Papers Oxford University Press 2012-02-01 2011-12-09 /pmc/articles/PMC3268249/ /pubmed/22171334 http://dx.doi.org/10.1093/bioinformatics/btr686 Text en © The Author(s) 2011. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Detection of microRNAs in color space
topic Original Papers
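
The color-space encoding mentioned in the description (each color represents 2 nt, and each nucleotide is covered by two consecutive colors) can be illustrated with a minimal Python sketch. It is not taken from the article or from SeqTrimMap. It assumes the standard SOLiD two-base code, in which bases are numbered A=0, C=1, G=2, T=3 and a color is the XOR of two adjacent bases, and it assumes a read is written as one known primer base followed by color digits. The function names `encode_colorspace` and `decode_colorspace` and the default primer base `T` are hypothetical choices for illustration.

```python
# Illustrative sketch of two-base (color-space) encoding; names are hypothetical.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode_colorspace(seq: str, primer: str = "T") -> str:
    """Encode a nucleotide sequence as a primer base followed by color digits.

    Each color is the XOR of the codes of two adjacent bases, so every
    nucleotide contributes to two consecutive colors.
    """
    full = primer + seq
    colors = [
        BASE_TO_CODE[left] ^ BASE_TO_CODE[right]
        for left, right in zip(full, full[1:])
    ]
    return primer + "".join(str(c) for c in colors)


def decode_colorspace(read: str) -> str:
    """Decode a color-space read (primer base + colors) back to nucleotides.

    Decoding is sequential: each base depends on the previous one, so the
    known primer base is required to recover the first nucleotide.
    """
    current = BASE_TO_CODE[read[0]]
    bases = []
    for color in read[1:]:
        current ^= int(color)
        bases.append(CODE_TO_BASE[current])
    return "".join(bases)


def color_differences(read_a: str, read_b: str) -> list:
    """Return the 0-based color positions (primer excluded) where two reads differ."""
    return [
        i
        for i, (a, b) in enumerate(zip(read_a[1:], read_b[1:]))
        if a != b
    ]
```

Under these assumptions, `encode_colorspace("ACGT")` gives `T3131`, and a single-nucleotide substitution (`ACAT`, encoded as `T3113`) changes two adjacent colors. A single miscalled color, by contrast, alters every decoded base downstream of it, which is the basis for distinguishing sequencing errors from polymorphisms. Because the first color depends on the primer base rather than on two bases of the biological sequence, discarding it during mapping is one plausible way the first nucleotide of a read could be lost.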