
M6AMRFS: Robust Prediction of N6-Methyladenosine Sites With Sequence-Based Features in Multiple Species

As one of the well-studied RNA methylation modifications, N6-methyladenosine (m(6)A) plays important roles in various biological processes, such as RNA splicing and degradation. Identification of m(6)A sites is fundamentally important for a better understanding of their functional mechanisms. Re...


Bibliographic Details
Main authors: Qiang, Xiaoli, Chen, Huangrong, Ye, Xiucai, Su, Ran, Wei, Leyi
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6209681/
https://www.ncbi.nlm.nih.gov/pubmed/30410501
http://dx.doi.org/10.3389/fgene.2018.00495
author Qiang, Xiaoli
Chen, Huangrong
Ye, Xiucai
Su, Ran
Wei, Leyi
collection PubMed
description As one of the well-studied RNA methylation modifications, N6-methyladenosine (m(6)A) plays important roles in various biological processes, such as RNA splicing and degradation. Identification of m(6)A sites is fundamentally important for a better understanding of their functional mechanisms. Recently, machine-learning-based prediction methods have emerged as an effective approach for fast and accurate identification of m(6)A sites. In this paper, we propose “M6AMRFS”, a new machine-learning-based predictor for the identification of m(6)A sites. In this predictor, we use a new feature representation algorithm to encode RNA sequences with two feature descriptors (dinucleotide binary encoding and local position-specific dinucleotide frequency), and apply the F-score algorithm combined with SFS (Sequential Forward Search) to enhance the feature representation ability. To predict m(6)A sites, we employ the eXtreme Gradient Boosting (XGBoost) algorithm to build a predictive model. Benchmarking results show that the proposed predictor is competitive with state-of-the-art predictors. Importantly, robust predictions for multiple species demonstrate that our predictive models have strong generalization ability. To the best of our knowledge, M6AMRFS is the first tool that can be used for the identification of m(6)A sites in multiple species. To facilitate the use of our predictor, we have established a user-friendly webserver implementing M6AMRFS, which is currently available at http://server.malab.cn/M6AMRFS/. We anticipate that it will be a useful tool for research on m(6)A sites.
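The description names two sequence descriptors and an F-score ranking step without giving their definitions. The following Python sketch illustrates one plausible reading; it is not the authors' implementation. Assumptions, none of which come from this record: each of the 16 dinucleotides is written as a 4-bit code in lexicographic order over A, C, G, U; the local frequency at a position is the share of dinucleotides in the prefix that match the one ending there; and the F-score is the commonly used two-class form (between-class spread over within-class variance). The function names are hypothetical. The Sequential Forward Search and the XGBoost model are not shown.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Assumption: dinucleotides are numbered in lexicographic order and each
# index is written as 4 bits (AA -> 0000, ..., UU -> 1111).
DINUC_CODE = {
    a + b: [int(bit) for bit in format(i, "04b")]
    for i, (a, b) in enumerate(product(NUCLEOTIDES, repeat=2))
}


def dinucleotide_binary(seq):
    """Concatenate the 4-bit codes of all overlapping dinucleotides."""
    features = []
    for i in range(len(seq) - 1):
        features.extend(DINUC_CODE[seq[i:i + 2]])
    return features


def local_dinucleotide_frequency(seq):
    """For each position i >= 1, return the fraction of dinucleotides in
    the prefix seq[:i + 1] that equal the dinucleotide ending at i."""
    features = []
    for i in range(1, len(seq)):
        current = seq[i - 1:i + 1]
        prefix_pairs = [seq[j - 1:j + 1] for j in range(1, i + 1)]
        features.append(prefix_pairs.count(current) / len(prefix_pairs))
    return features


def f_score(column, labels):
    """Two-class F-score of one feature column (labels are 0 or 1).
    Needs at least two samples per class."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    mean_all = sum(column) / len(column)
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    within = (sum((x - mean_pos) ** 2 for x in pos) / (len(pos) - 1)
              + sum((x - mean_neg) ** 2 for x in neg) / (len(neg) - 1))
    if within == 0:
        # No within-class variance: perfectly separated or uninformative.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return feature indices sorted by descending F-score."""
    columns = list(zip(*rows))
    scores = [f_score(list(col), labels) for col in columns]
    return sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
```

A forward search would then add features in this ranked order and keep the subset that gives the best cross-validated performance for the chosen classifier.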
format Online
Article
Text
id pubmed-6209681
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6209681 2018-11-08 Front Genet Genetics Frontiers Media S.A. 2018-10-25 Text en
Copyright © 2018 Qiang, Chen, Ye, Su and Wei. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title M6AMRFS: Robust Prediction of N6-Methyladenosine Sites With Sequence-Based Features in Multiple Species
topic Genetics