Distinguishing mirtrons from canonical miRNAs with data exploration and machine learning methods

Bibliographic Details
Main authors: Rorbach, Grzegorz; Unold, Olgierd; Konopka, Bogumil M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5953923/
https://www.ncbi.nlm.nih.gov/pubmed/29765080
http://dx.doi.org/10.1038/s41598-018-25578-3
author Rorbach, Grzegorz
Unold, Olgierd
Konopka, Bogumil M.
collection PubMed
description Mirtrons are non-canonical, intron-encoded microRNAs whose biogenesis starts with splicing. They are not processed by Drosha and enter the canonical pathway at the Exportin-5 level. Mirtrons are much less evolutionarily conserved than canonical miRNAs. Because of these differences, canonical miRNA predictors are not applicable to mirtron prediction. Identifying the differences is important for designing mirtron prediction algorithms and may improve the understanding of how mirtrons function. So far, only simple, single-feature comparisons have been reported, and these are insensitive to complex relations between features. We quantified miRNAs with 25 features and showed that it is impossible to distinguish the two miRNA species using simple thresholds on any single feature. However, under Principal Component Analysis, mirtrons and canonical miRNAs group separately. Moreover, several methodologically diverse machine learning classifiers delivered high classification performance. Using feature selection algorithms, we found that some features previously reported as divergent between the two classes (e.g. bulges in the stem region) did not contribute to improving classification accuracy, which suggests that they are not biologically meaningful. Finally, we proposed a combination of the most important features (including guanine content, hairpin free energy and hairpin length) that conveys a specific pattern crucial for identifying mirtrons.
format Online
Article
Text
id pubmed-5953923
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-5953923 2018-05-21 Sci Rep Article
Nature Publishing Group UK 2018-05-15 /pmc/articles/PMC5953923/ /pubmed/29765080 http://dx.doi.org/10.1038/s41598-018-25578-3 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Distinguishing mirtrons from canonical miRNAs with data exploration and machine learning methods
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5953923/
https://www.ncbi.nlm.nih.gov/pubmed/29765080
http://dx.doi.org/10.1038/s41598-018-25578-3
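
The description in this record names guanine content and hairpin length among the most important features for identifying mirtrons. As a rough illustration only, the sketch below computes those two sequence-level quantities in Python. The function names and the toy sequence are hypothetical and are not taken from the article; the article's exact feature definitions are not given in this record, and hairpin free energy is omitted because it requires an RNA folding tool.

```python
def guanine_content(seq: str) -> float:
    """Return the fraction of G nucleotides in a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return seq.count("G") / len(seq)


def hairpin_length(seq: str) -> int:
    """Return the hairpin length, taken here simply as the number of nucleotides."""
    return len(seq)


def describe(seq: str) -> dict:
    """Collect the two illustrative features into one record."""
    return {
        "hairpin_length": hairpin_length(seq),
        "guanine_content": guanine_content(seq),
    }


if __name__ == "__main__":
    # Toy sequence for illustration only; not a real mirtron or miRNA hairpin.
    toy_hairpin = "GGAUCCGGAU"
    print(describe(toy_hairpin))
```

According to the description, no single feature of this kind separates the two classes by a simple threshold, so values like these would be inputs to a multivariate method rather than classifiers in their own right.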