Cargando…

Classification of ncRNAs using position and size information in deep sequencing data

Motivation: Small non-coding RNAs (ncRNAs) play important roles in various cellular functions in all clades of life. With next-generation sequencing techniques, it has become possible to study ncRNAs in a high-throughput manner and by using specialized algorithms ncRNA classes such as miRNAs can be...

Descripción completa

Detalles Bibliográficos
Autores principales: Erhard, Florian, Zimmer, Ralf
Formato: Texto
Lenguaje:English
Publicado: Oxford University Press 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935403/
https://www.ncbi.nlm.nih.gov/pubmed/20823303
http://dx.doi.org/10.1093/bioinformatics/btq363
_version_ 1782186395051229184
author Erhard, Florian
Zimmer, Ralf
author_facet Erhard, Florian
Zimmer, Ralf
author_sort Erhard, Florian
collection PubMed
description Motivation: Small non-coding RNAs (ncRNAs) play important roles in various cellular functions in all clades of life. With next-generation sequencing techniques, it has become possible to study ncRNAs in a high-throughput manner and by using specialized algorithms ncRNA classes such as miRNAs can be detected in deep sequencing data. Typically, such methods are targeted to a certain class of ncRNA. Many methods rely on RNA secondary structure prediction, which is not always accurate and not all ncRNA classes are characterized by a common secondary structure. Unbiased classification methods for ncRNAs could be important to improve accuracy and to detect new ncRNA classes in sequencing data. Results: Here, we present a scoring system called ALPS (alignment of pattern matrices score) that only uses primary information from a deep sequencing experiment, i.e. the relative positions and lengths of reads, to classify ncRNAs. ALPS makes no further assumptions, e.g. about common structural properties in the ncRNA class and is nevertheless able to identify ncRNA classes with high accuracy. Since ALPS is not designed to recognize a certain class of ncRNA, it can be used to detect novel ncRNA classes, as long as these unknown ncRNAs have a characteristic pattern of deep sequencing read lengths and positions. We evaluate our scoring system on publicly available deep sequencing data and show that it is able to classify known ncRNAs with high sensitivity and specificity. Availability: Calculated pattern matrices of the datasets hESC and EB are available at the project web site http://www.bio.ifi.lmu.de/ALPS. An implementation of the described method is available upon request from the authors. Contact: florian.erhard@bio.ifi.lmu.de
format Text
id pubmed-2935403
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-29354032010-09-08 Classification of ncRNAs using position and size information in deep sequencing data Erhard, Florian Zimmer, Ralf Bioinformatics Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium Motivation: Small non-coding RNAs (ncRNAs) play important roles in various cellular functions in all clades of life. With next-generation sequencing techniques, it has become possible to study ncRNAs in a high-throughput manner and by using specialized algorithms ncRNA classes such as miRNAs can be detected in deep sequencing data. Typically, such methods are targeted to a certain class of ncRNA. Many methods rely on RNA secondary structure prediction, which is not always accurate and not all ncRNA classes are characterized by a common secondary structure. Unbiased classification methods for ncRNAs could be important to improve accuracy and to detect new ncRNA classes in sequencing data. Results: Here, we present a scoring system called ALPS (alignment of pattern matrices score) that only uses primary information from a deep sequencing experiment, i.e. the relative positions and lengths of reads, to classify ncRNAs. ALPS makes no further assumptions, e.g. about common structural properties in the ncRNA class and is nevertheless able to identify ncRNA classes with high accuracy. Since ALPS is not designed to recognize a certain class of ncRNA, it can be used to detect novel ncRNA classes, as long as these unknown ncRNAs have a characteristic pattern of deep sequencing read lengths and positions. We evaluate our scoring system on publicly available deep sequencing data and show that it is able to classify known ncRNAs with high sensitivity and specificity. Availability: Calculated pattern matrices of the datasets hESC and EB are available at the project web site http://www.bio.ifi.lmu.de/ALPS. An implementation of the described method is available upon request from the authors. Contact: florian.erhard@bio.ifi.lmu.de Oxford University Press 2010-09-15 2010-09-04 /pmc/articles/PMC2935403/ /pubmed/20823303 http://dx.doi.org/10.1093/bioinformatics/btq363 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium
Erhard, Florian
Zimmer, Ralf
Classification of ncRNAs using position and size information in deep sequencing data
title Classification of ncRNAs using position and size information in deep sequencing data
title_full Classification of ncRNAs using position and size information in deep sequencing data
title_fullStr Classification of ncRNAs using position and size information in deep sequencing data
title_full_unstemmed Classification of ncRNAs using position and size information in deep sequencing data
title_short Classification of ncRNAs using position and size information in deep sequencing data
title_sort classification of ncrnas using position and size information in deep sequencing data
topic Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935403/
https://www.ncbi.nlm.nih.gov/pubmed/20823303
http://dx.doi.org/10.1093/bioinformatics/btq363
work_keys_str_mv AT erhardflorian classificationofncrnasusingpositionandsizeinformationindeepsequencingdata
AT zimmerralf classificationofncrnasusingpositionandsizeinformationindeepsequencingdata