Transcriptome-Wide Annotation of m(5)C RNA Modifications Using Machine Learning

The emergence of the epitranscriptome opened a new chapter in gene regulation. 5-methylcytosine (m(5)C), an important post-transcriptional modification, is involved in a variety of biological processes such as subcellular localization and translational fidelity. Though high-thr...

Bibliographic Details
Main authors: Song, Jie; Zhai, Jingjing; Bian, Enze; Song, Yujia; Yu, Jiantao; Ma, Chuang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Plant Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5915569/
https://www.ncbi.nlm.nih.gov/pubmed/29720995
http://dx.doi.org/10.3389/fpls.2018.00519
author Song, Jie
Zhai, Jingjing
Bian, Enze
Song, Yujia
Yu, Jiantao
Ma, Chuang
collection PubMed
description The emergence of the epitranscriptome opened a new chapter in gene regulation. 5-methylcytosine (m(5)C), an important post-transcriptional modification, is involved in a variety of biological processes such as subcellular localization and translational fidelity. Although high-throughput experimental technologies have been developed and applied to profile m(5)C modifications under certain conditions, transcriptome-wide studies of m(5)C modifications are still hindered by the dynamic nature of m(5)C and the lack of computational prediction methods. In this study, we introduce PEA-m5C, a machine learning-based m(5)C predictor trained with features extracted from the flanking sequences of m(5)C modifications. PEA-m5C yielded an average AUC (area under the receiver operating characteristic curve) of 0.939 in 10-fold cross-validation experiments based on known Arabidopsis m(5)C modifications. Rigorous independent testing showed that PEA-m5C (accuracy [Acc] = 0.835, Matthews correlation coefficient [MCC] = 0.688) is markedly superior to the recently developed m(5)C predictor iRNAm5C-PseDNC (Acc = 0.665, MCC = 0.332). PEA-m5C has been applied to predict candidate m(5)C modifications in annotated Arabidopsis transcripts. Further analysis of these m(5)C candidates showed that the position 4 nt downstream of the translational start site is the most frequently methylated. PEA-m5C is freely available to academic users at: https://github.com/cma2015/PEA-m5C.
format Online
Article
Text
id pubmed-5915569
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5915569 2018-05-02 Transcriptome-Wide Annotation of m(5)C RNA Modifications Using Machine Learning Song, Jie; Zhai, Jingjing; Bian, Enze; Song, Yujia; Yu, Jiantao; Ma, Chuang Front Plant Sci Plant Science Frontiers Media S.A. 2018-04-18 /pmc/articles/PMC5915569/ /pubmed/29720995 http://dx.doi.org/10.3389/fpls.2018.00519 Text en Copyright © 2018 Song, Zhai, Bian, Song, Yu and Ma.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Transcriptome-Wide Annotation of m(5)C RNA Modifications Using Machine Learning
topic Plant Science