A k-mer scheme to predict piRNAs and characterize locust piRNAs

Motivation: Identifying piwi-interacting RNAs (piRNAs) of non-model organisms is a difficult and unsolved problem because piRNAs lack conservative secondary structure motifs and sequence homology in different species. Results: In this article, a k-mer scheme is proposed to identify piRNA sequences,...

Bibliographic Details
Main authors: Zhang, Yi, Wang, Xianhui, Kang, Le
Format: Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3051322/
https://www.ncbi.nlm.nih.gov/pubmed/21224287
http://dx.doi.org/10.1093/bioinformatics/btr016
author Zhang, Yi
Wang, Xianhui
Kang, Le
collection PubMed
description Motivation: Identifying piwi-interacting RNAs (piRNAs) of non-model organisms is a difficult and unsolved problem because piRNAs lack conservative secondary structure motifs and sequence homology in different species.
Results: In this article, a k-mer scheme is proposed to identify piRNA sequences, relying on the training sets from non-piRNA and piRNA sequences of five model species sequenced: rat, mouse, human, fruit fly and nematode. Compared with the existing ‘static’ scheme based on the position-specific base usage, our novel ‘dynamic’ algorithm performs much better with a precision of over 90% and a sensitivity of over 60%, and the precision is verified by 5-fold cross-validation in these species. To test its validity, we use the algorithm to identify piRNAs of the migratory locust based on 603 607 deep-sequenced small RNA sequences. Totally, 87 536 piRNAs of the locust are predicted, and 4426 of them matched with existing locust transposons. The transcriptional difference between solitary and gregarious locusts was described. We also revisit the position-specific base usage of piRNAs and find the conservation in the end of piRNAs. Therefore, the method we developed can be used to identify piRNAs of non-model organisms without complete genome sequences.
Availability: The web server for implementing the algorithm and the software code are freely available to the academic community at http://59.79.168.90/piRNA/index.php.
Contact: lkang@ioz.ac.cn
Supplementary information: Supplementary data are available at Bioinformatics online.
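The abstract names a k-mer scheme but this record does not describe its details. As a rough illustration only, the sketch below shows one common way to turn a short RNA read into k-mer frequency features that a classifier could be trained on. The function names, the choice of `max_k=3`, and the normalization are assumptions made for this example and are not taken from the paper.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: reads are handled as DNA letters; U is mapped to T


def kmer_frequencies(seq, k):
    """Return the relative frequency of every possible k-mer in seq."""
    seq = seq.upper().replace("U", "T")
    # Start every one of the 4**k possible k-mers at zero so the output
    # always has the same keys, whatever the input sequence is.
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    if total == 0:
        return {kmer: 0.0 for kmer in counts}
    return {kmer: n / total for kmer, n in counts.items()}


def feature_vector(seq, max_k=3):
    """Concatenate k-mer frequencies for k = 1..max_k into one flat list."""
    features = []
    for k in range(1, max_k + 1):
        freqs = kmer_frequencies(seq, k)
        # Sort the keys so that feature positions are stable across sequences.
        features.extend(freqs[kmer] for kmer in sorted(freqs))
    return features
```

With `max_k=3` each sequence maps to 4 + 16 + 64 = 84 numbers. Vectors like these, computed for known piRNA and non-piRNA sequences, could serve as training input for a classifier of the reader's choice; which classifier and which k values the authors used is not stated in this record.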
format Text
id pubmed-3051322
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3051322 2011-03-10 A k-mer scheme to predict piRNAs and characterize locust piRNAs Zhang, Yi Wang, Xianhui Kang, Le Bioinformatics Original Papers Oxford University Press 2011-03-15 2011-01-11 /pmc/articles/PMC3051322/ /pubmed/21224287 http://dx.doi.org/10.1093/bioinformatics/btr016 Text en © The Author(s) 2011. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A k-mer scheme to predict piRNAs and characterize locust piRNAs
topic Original Papers