
Effective computational detection of piRNAs using n-gram models and support vector machine

BACKGROUND: Piwi-interacting RNAs (piRNAs) are a new class of small non-coding RNAs that are known to be associated with RNA silencing. The piRNAs play an important role in protecting the genome from invasive transposons in the germline. Recent studies have shown that piRNAs are linked to the genome...


Bibliographic Details
Main authors: Chen, Chun-Chi, Qian, Xiaoning, Yoon, Byung-Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751586/
https://www.ncbi.nlm.nih.gov/pubmed/29297285
http://dx.doi.org/10.1186/s12859-017-1896-1
_version_ 1783289977964068864
author Chen, Chun-Chi
Qian, Xiaoning
Yoon, Byung-Jun
collection PubMed
description BACKGROUND: Piwi-interacting RNAs (piRNAs) are a new class of small non-coding RNAs that are known to be associated with RNA silencing. They play an important role in protecting the genome from invasive transposons in the germline. Recent studies have linked piRNAs to genome stability and to a variety of human cancers. Given their clinical importance, there is a pressing need for effective computational methods for identifying piRNAs. However, piRNAs lack conserved structural motifs and show relatively low sequence similarity across species, which makes accurate computational prediction challenging. RESULTS: In this paper, we propose a novel method, piRNAdetect, for reliable computational prediction of piRNAs in genome sequences. In the proposed method, we first classify piRNA sequences in the training dataset that share similar sequence motifs and extract effective predictive features using n-gram models (NGMs). The extracted NGM-based features are then used to construct a support vector machine for accurate prediction of novel piRNAs. CONCLUSIONS: We demonstrate the effectiveness of the proposed piRNAdetect algorithm through extensive performance evaluation on piRNAs from three species (H. sapiens, R. norvegicus, and M. musculus) obtained from piRBase, and show that piRNAdetect outperforms the current state-of-the-art methods in terms of efficiency and accuracy.
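The feature-extraction idea in the abstract can be illustrated with a minimal sketch. This is not the piRNAdetect implementation: the function names, the RNA alphabet `ACGU`, the additive smoothing, the choice of n, and the toy sequences are all assumptions made for illustration. It trains one n-gram model per group of sequences and scores a candidate by its average log-likelihood under each model; the resulting vector is the kind of input a support vector machine (for example scikit-learn's `SVC`, not shown here) could be trained on.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumption: sequences use only these four symbols


def train_ngram_model(sequences, n=2, pseudocount=1.0):
    """Estimate P(base | previous n-1 bases) with additive smoothing."""
    ngram_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            ngram_counts[gram] += 1
            context_counts[gram[:-1]] += 1
    model = {}
    for ctx in map("".join, product(ALPHABET, repeat=n - 1)):
        total = context_counts[ctx] + pseudocount * len(ALPHABET)
        for base in ALPHABET:
            model[ctx + base] = (ngram_counts[ctx + base] + pseudocount) / total
    return model


def avg_log_likelihood(seq, model, n=2):
    """Average log-probability of the n-grams in seq under one model."""
    grams = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    if not grams:  # sequence shorter than n: no evidence either way
        return 0.0
    return sum(math.log(model[g]) for g in grams) / len(grams)


def ngm_features(seq, models, n=2):
    """One feature per n-gram model (hypothetical feature layout)."""
    return [avg_log_likelihood(seq, m, n) for m in models]


# Toy, made-up sequence groups standing in for motif-based training groups.
group_a = ["UAUAUAUAUAUA", "UAUAUAUAUA"]
group_b = ["GCGCGCGCGCGC", "GCGCGCGCGC"]
models = [train_ngram_model(group_a), train_ngram_model(group_b)]

features = ngm_features("UAUAUA", models)
print(features)  # first entry (group_a model) is larger than the second
```

Using the per-position average keeps the features comparable across sequences of different lengths; whether the published method normalizes in this way is not stated in the abstract.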
format Online
Article
Text
id pubmed-5751586
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5751586 2018-01-05 Effective computational detection of piRNAs using n-gram models and support vector machine Chen, Chun-Chi Qian, Xiaoning Yoon, Byung-Jun BMC Bioinformatics Research
BioMed Central 2017-12-28 /pmc/articles/PMC5751586/ /pubmed/29297285 http://dx.doi.org/10.1186/s12859-017-1896-1 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Effective computational detection of piRNAs using n-gram models and support vector machine
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751586/
https://www.ncbi.nlm.nih.gov/pubmed/29297285
http://dx.doi.org/10.1186/s12859-017-1896-1