
MicroRNA categorization using sequence motifs and k-mers

BACKGROUND: Post-transcriptional gene dysregulation can be a hallmark of diseases like cancer and microRNAs (miRNAs) play a key role in the modulation of translation efficiency. Known pre-miRNAs are listed in miRBase, and they have been discovered in a variety of organisms ranging from viruses and m...


Bibliographic Details
Main Authors: Yousef, Malik, Khalifa, Waleed, Acar, İlhan Erkin, Allmer, Jens
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5351198/
https://www.ncbi.nlm.nih.gov/pubmed/28292266
http://dx.doi.org/10.1186/s12859-017-1584-1
_version_ 1782514729084780544
author Yousef, Malik
Khalifa, Waleed
Acar, İlhan Erkin
Allmer, Jens
author_facet Yousef, Malik
Khalifa, Waleed
Acar, İlhan Erkin
Allmer, Jens
author_sort Yousef, Malik
collection PubMed
description BACKGROUND: Post-transcriptional gene dysregulation can be a hallmark of diseases such as cancer, and microRNAs (miRNAs) play a key role in the modulation of translation efficiency. Known pre-miRNAs are listed in miRBase, and they have been discovered in a variety of organisms ranging from viruses and microbes to eukaryotic organisms. The computational detection of pre-miRNAs is of great interest, and such approaches usually employ machine learning to discriminate between miRNAs and other sequences. Many features have been proposed for describing pre-miRNAs, and we have previously introduced sequence motifs and k-mers as useful ones. There have been reports of xeno-miRNAs detected via next-generation sequencing. However, these may be contamination, and to aid that important decision-making process we aimed to establish a means to differentiate pre-miRNAs from different species. RESULTS: To distinguish among species, we used one species’ pre-miRNAs as the positive and another species’ pre-miRNAs as the negative training and test data for establishing machine-learned models based on sequence motifs and k-mers as features. This approach resulted in higher accuracy values between distantly related species, while more closely related species produced lower accuracy values. CONCLUSIONS: We were able to differentiate among species with increasing success as evolutionary distance increases. This conclusion is supported by previous reports of fast evolutionary changes in miRNAs, since fairly good discrimination was possible even between relatively closely related species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1584-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5351198
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5351198 2017-03-17 MicroRNA categorization using sequence motifs and k-mers Yousef, Malik Khalifa, Waleed Acar, İlhan Erkin Allmer, Jens BMC Bioinformatics Research Article
BioMed Central 2017-03-14 /pmc/articles/PMC5351198/ /pubmed/28292266 http://dx.doi.org/10.1186/s12859-017-1584-1 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Yousef, Malik
Khalifa, Waleed
Acar, İlhan Erkin
Allmer, Jens
MicroRNA categorization using sequence motifs and k-mers
title MicroRNA categorization using sequence motifs and k-mers
title_full MicroRNA categorization using sequence motifs and k-mers
title_fullStr MicroRNA categorization using sequence motifs and k-mers
title_full_unstemmed MicroRNA categorization using sequence motifs and k-mers
title_short MicroRNA categorization using sequence motifs and k-mers
title_sort microrna categorization using sequence motifs and k-mers
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5351198/
https://www.ncbi.nlm.nih.gov/pubmed/28292266
http://dx.doi.org/10.1186/s12859-017-1584-1
work_keys_str_mv AT yousefmalik micrornacategorizationusingsequencemotifsandkmers
AT khalifawaleed micrornacategorizationusingsequencemotifsandkmers
AT acarilhanerkin micrornacategorizationusingsequencemotifsandkmers
AT allmerjens micrornacategorizationusingsequencemotifsandkmers
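
The abstract in this record describes training models on k-mer features, with one species' pre-miRNAs as the positive class and another species' pre-miRNAs as the negative class. The following is a minimal sketch of that general idea only, not the authors' pipeline. The record does not say which learner, which k, or which motif-discovery step was used, so a simple nearest-centroid rule is swapped in for illustration and the motif features are omitted. All function names (`kmer_features`, `train`, `predict`) and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGU"  # RNA alphabet; any T is mapped to U below


def kmer_features(seq, k=2):
    """Return relative frequencies of all 4**k k-mers in a fixed order."""
    seq = seq.upper().replace("T", "U")
    windows = max(len(seq) - k + 1, 1)  # avoid division by zero for short input
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] / windows for p in product(ALPHABET, repeat=k)]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(positive_seqs, negative_seqs, k=2):
    """Build one centroid per class (illustrative stand-in for a real learner)."""
    pos = centroid([kmer_features(s, k) for s in positive_seqs])
    neg = centroid([kmer_features(s, k) for s in negative_seqs])
    return pos, neg


def predict(model, seq, k=2):
    """Assign the class whose centroid is closer in k-mer frequency space."""
    pos, neg = model
    feats = kmer_features(seq, k)
    return "positive" if distance(feats, pos) <= distance(feats, neg) else "negative"


# Made-up toy sequences, NOT real pre-miRNAs; real input would come from miRBase.
species_a = ["GGGGCCGGGCGCGGCC", "GCGCGGGCCCGGGGCG"]
species_b = ["AAUUAUAUUAAAUUAU", "UAUAUUAAAUAUUUAA"]

model = train(species_a, species_b)
print(predict(model, "GGCCGCGGGGCC"))
```

Relative frequencies rather than raw counts are used so that sequences of different lengths remain comparable. Repeating the same train-and-test step for every pair of species would give the kind of pairwise accuracy comparison the abstract refers to.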