Towards a piRNA prediction using multiple kernel fusion and support vector machine

Motivation: Piwi-interacting RNA (piRNA) is the most recently discovered and the least investigated class of Argonaute/Piwi protein-interacting small non-coding RNAs. The piRNAs are mostly known to be involved in protecting the genome from invasive transposable elements. But recent discoveries sugge...

Bibliographic Details
Main authors: Brayet, Jocelyn, Zehraoui, Farida, Jeanson-Leh, Laurence, Israeli, David, Tahi, Fariza
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147894/
https://www.ncbi.nlm.nih.gov/pubmed/25161221
http://dx.doi.org/10.1093/bioinformatics/btu441
author Brayet, Jocelyn
Zehraoui, Farida
Jeanson-Leh, Laurence
Israeli, David
Tahi, Fariza
author_sort Brayet, Jocelyn
collection PubMed
description Motivation: Piwi-interacting RNA (piRNA) is the most recently discovered and the least investigated class of Argonaute/Piwi protein-interacting small non-coding RNAs. The piRNAs are mostly known to be involved in protecting the genome from invasive transposable elements. Recent discoveries, however, suggest their involvement in the pathophysiology of diseases, such as cancer. Their identification is therefore an important task, and computational methods are needed. However, the lack of conserved piRNA sequences and structural elements makes this identification challenging. Results: In the present study, we propose a new modular and extensible machine learning method based on multiple kernels and a support vector machine (SVM) classifier for piRNA identification. Very few piRNA features are known to date. The multiple-kernel approach allows heterogeneous piRNA features to be edited, added or removed in a modular manner according to their relevance in a given species. Our algorithm is based on a combination of the previously identified features [sequence features (k-mer motifs and a uridine at the first position) and a piRNA cluster feature] and a new telomere/centromere vicinity feature. These features are heterogeneous, and the kernels make it possible to unify their representation. The proposed algorithm, named piRPred, gives promising results on Drosophila and human data and outperforms previously published piRNA identification algorithms. Availability and implementation: piRPred is freely available to non-commercial users on our Web server EvryRNA (http://EvryRNA.ibisc.univ-evry.fr). Contact: tahi@ibisc.univ-evry.fr
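The kernel-fusion idea in the description can be illustrated with a minimal sketch. It is not the piRPred implementation: the record does not state which kernels, weights, or k-mer length the authors use, so the normalised k-mer spectrum kernel, the binary first-uridine kernel, the weights (0.7, 0.3), k = 2, and the toy sequences below are all assumptions chosen for illustration. The cluster and telomere/centromere vicinity features are omitted because they need genomic coordinates.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in an RNA sequence (k=2 is an assumption)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=2):
    """Cosine-normalised dot product of the two k-mer count vectors."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    dot = sum(ca[m] * cb[m] for m in ca)
    norm = sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def first_u_kernel(a, b):
    """Linear kernel on one binary feature: uridine at the first position."""
    return float(a[:1] == "U") * float(b[:1] == "U")


def fused_kernel(a, b, weights=(0.7, 0.3)):
    """Weighted sum of the per-feature kernels (hypothetical weights).

    A non-negative weighted sum of valid kernels is itself a valid kernel,
    which is what lets heterogeneous features share one representation.
    """
    w_seq, w_u = weights
    return w_seq * spectrum_kernel(a, b) + w_u * first_u_kernel(a, b)


def gram_matrix(seqs, kernel):
    """Pairwise kernel values for a list of sequences."""
    return [[kernel(a, b) for b in seqs] for a in seqs]


# Toy sequences invented for illustration; they are not taken from the paper.
toy = ["UGAGGUAGUAGG", "UAGCUUAUCAGA", "GGCCAAUUGGCC"]
K = gram_matrix(toy, fused_kernel)
```

Adding or removing a feature in this sketch amounts to adding or removing one term in `fused_kernel`. A fused Gram matrix such as `K` could then be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.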
format Online
Article
Text
id pubmed-4147894
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4147894 2014-09-02 Towards a piRNA prediction using multiple kernel fusion and support vector machine Brayet, Jocelyn Zehraoui, Farida Jeanson-Leh, Laurence Israeli, David Tahi, Fariza Bioinformatics Eccb 2014 Proceedings Papers Committee Oxford University Press 2014-09-01 2014-08-22 /pmc/articles/PMC4147894/ /pubmed/25161221 http://dx.doi.org/10.1093/bioinformatics/btu441 Text en
© The Author 2014. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Towards a piRNA prediction using multiple kernel fusion and support vector machine
topic Eccb 2014 Proceedings Papers Committee
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147894/
https://www.ncbi.nlm.nih.gov/pubmed/25161221
http://dx.doi.org/10.1093/bioinformatics/btu441