Towards a piRNA prediction using multiple kernel fusion and support vector machine
Motivation: Piwi-interacting RNA (piRNA) is the most recently discovered and the least investigated class of Argonaute/Piwi protein-interacting small non-coding RNAs. The piRNAs are mostly known to be involved in protecting the genome from invasive transposable elements. But recent discoveries sugge...
Main authors: | Brayet, Jocelyn; Zehraoui, Farida; Jeanson-Leh, Laurence; Israeli, David; Tahi, Fariza |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2014 |
Subjects: | Eccb 2014 Proceedings Papers Committee |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147894/ https://www.ncbi.nlm.nih.gov/pubmed/25161221 http://dx.doi.org/10.1093/bioinformatics/btu441 |
_version_ | 1782332531889143808 |
---|---|
author | Brayet, Jocelyn Zehraoui, Farida Jeanson-Leh, Laurence Israeli, David Tahi, Fariza |
author_facet | Brayet, Jocelyn Zehraoui, Farida Jeanson-Leh, Laurence Israeli, David Tahi, Fariza |
author_sort | Brayet, Jocelyn |
collection | PubMed |
description | Motivation: Piwi-interacting RNA (piRNA) is the most recently discovered and the least investigated class of Argonaute/Piwi protein-interacting small non-coding RNAs. The piRNAs are mostly known to be involved in protecting the genome from invasive transposable elements. Recent discoveries also suggest their involvement in the pathophysiology of diseases such as cancer. Their identification is therefore an important task, and computational methods are needed. However, the lack of conserved piRNA sequences and structural elements makes this identification challenging. Results: In the present study, we propose a new modular and extensible machine learning method based on multiple kernels and a support vector machine (SVM) classifier for piRNA identification. Very few piRNA features are known to date. A multiple-kernel approach allows heterogeneous piRNA features to be edited, added or removed in a modular manner, according to their relevance in a given species. Our algorithm is based on a combination of previously identified features [sequence features (k-mer motifs and a uridine at the first position) and a piRNA cluster feature] and a new telomere/centromere vicinity feature. These features are heterogeneous, and the kernels provide a unified representation for them. The proposed algorithm, named piRPred, gives promising results on Drosophila and human data and outscores previously published piRNA identification algorithms. Availability and implementation: piRPred is freely available to non-commercial users on our Web server EvryRNA: http://EvryRNA.ibisc.univ-evry.fr Contact: tahi@ibisc.univ-evry.fr |
format | Online Article Text |
id | pubmed-4147894 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4147894 2014-09-02 Towards a piRNA prediction using multiple kernel fusion and support vector machine Brayet, Jocelyn Zehraoui, Farida Jeanson-Leh, Laurence Israeli, David Tahi, Fariza Bioinformatics Eccb 2014 Proceedings Papers Committee Oxford University Press 2014-09-01 2014-08-22 /pmc/articles/PMC4147894/ /pubmed/25161221 http://dx.doi.org/10.1093/bioinformatics/btu441 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Eccb 2014 Proceedings Papers Committee Brayet, Jocelyn Zehraoui, Farida Jeanson-Leh, Laurence Israeli, David Tahi, Fariza Towards a piRNA prediction using multiple kernel fusion and support vector machine |
title | Towards a piRNA prediction using multiple kernel fusion and support vector machine |
title_full | Towards a piRNA prediction using multiple kernel fusion and support vector machine |
title_fullStr | Towards a piRNA prediction using multiple kernel fusion and support vector machine |
title_full_unstemmed | Towards a piRNA prediction using multiple kernel fusion and support vector machine |
title_short | Towards a piRNA prediction using multiple kernel fusion and support vector machine |
title_sort | towards a pirna prediction using multiple kernel fusion and support vector machine |
topic | Eccb 2014 Proceedings Papers Committee |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147894/ https://www.ncbi.nlm.nih.gov/pubmed/25161221 http://dx.doi.org/10.1093/bioinformatics/btu441 |
work_keys_str_mv | AT brayetjocelyn towardsapirnapredictionusingmultiplekernelfusionandsupportvectormachine AT zehraouifarida towardsapirnapredictionusingmultiplekernelfusionandsupportvectormachine AT jeansonlehlaurence towardsapirnapredictionusingmultiplekernelfusionandsupportvectormachine AT israelidavid towardsapirnapredictionusingmultiplekernelfusionandsupportvectormachine AT tahifariza towardsapirnapredictionusingmultiplekernelfusionandsupportvectormachine |
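
The description field above says that heterogeneous piRNA features are brought into a common representation through kernels and then combined. As a rough illustration only, the sketch below fuses two kernels by a weighted sum: a k-mer spectrum kernel and a kernel on the presence of a uridine at the first position. The function names, the choice of k = 3, the cosine normalisation and the equal weights are all assumptions made for this example; the record does not state which kernels or weights piRPred actually uses, and the cluster and telomere/centromere vicinity features are omitted here because they need genomic coordinates.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in an RNA sequence (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Cosine-normalised dot product of the two k-mer count vectors."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    dot = sum(ca[m] * cb[m] for m in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def first_u_kernel(a, b):
    """1.0 when both sequences agree on starting with a uridine, else 0.0."""
    ua = a[:1].upper() in ("U", "T")
    ub = b[:1].upper() in ("U", "T")
    return 1.0 if ua == ub else 0.0


def fused_kernel(a, b, weights=(0.5, 0.5)):
    """Weighted sum of the two kernels (hypothetical, illustrative weights)."""
    w_seq, w_u = weights
    return w_seq * spectrum_kernel(a, b) + w_u * first_u_kernel(a, b)


def gram_matrix(seqs, kernel):
    """Pairwise kernel values for a list of sequences."""
    return [[kernel(a, b) for b in seqs] for a in seqs]
```

A non-negative weighted sum of valid kernels is itself a valid kernel, so the resulting Gram matrix could in principle be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. Adding or removing a feature then amounts to adding or removing one term in `fused_kernel`, which is the modularity the description refers to.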