M6AMRFS: Robust Prediction of N6-Methyladenosine Sites With Sequence-Based Features in Multiple Species
As one of the well-studied RNA methylation modifications, N6-methyladenosine (m(6)A) plays important roles in various biological processes, such as RNA splicing and degradation. Identifying m(6)A sites is fundamentally important for better understanding their functional mechanisms. Re...
Main authors: | Qiang, Xiaoli; Chen, Huangrong; Ye, Xiucai; Su, Ran; Wei, Leyi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6209681/ https://www.ncbi.nlm.nih.gov/pubmed/30410501 http://dx.doi.org/10.3389/fgene.2018.00495 |
_version_ | 1783366945972682752 |
---|---|
author | Qiang, Xiaoli; Chen, Huangrong; Ye, Xiucai; Su, Ran; Wei, Leyi |
author_sort | Qiang, Xiaoli |
collection | PubMed |
description | As one of the well-studied RNA methylation modifications, N6-methyladenosine (m(6)A) plays important roles in various biological processes, such as RNA splicing and degradation. Identifying m(6)A sites is fundamentally important for better understanding their functional mechanisms. Recently, machine-learning-based prediction methods have emerged as an effective approach for fast and accurate identification of m(6)A sites. In this paper, we propose “M6AMRFS”, a new machine-learning-based predictor for the identification of m(6)A sites. In this predictor, we use a new feature representation algorithm that encodes RNA sequences with two feature descriptors (dinucleotide binary encoding and local position-specific dinucleotide frequency), and apply the F-score algorithm combined with SFS (Sequential Forward Search) to enhance the feature representation ability. To predict m(6)A sites, we employ the eXtreme Gradient Boosting (XGBoost) algorithm to build a predictive model. Benchmarking results show that the proposed predictor is competitive with state-of-the-art predictors. Importantly, robust predictions for multiple species demonstrate that our predictive models have strong generalization ability. To the best of our knowledge, M6AMRFS is the first tool that can be used to identify m(6)A sites in multiple species. To facilitate its use, we have established a user-friendly webserver implementing M6AMRFS, which is currently available at http://server.malab.cn/M6AMRFS/. We anticipate that it will be a useful tool for research on m(6)A sites. |
format | Online Article Text |
id | pubmed-6209681 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6209681 2018-11-08 Front Genet Genetics Frontiers Media S.A. 2018-10-25 /pmc/articles/PMC6209681/ /pubmed/30410501 http://dx.doi.org/10.3389/fgene.2018.00495 Text en Copyright © 2018 Qiang, Chen, Ye, Su and Wei. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | M6AMRFS: Robust Prediction of N6-Methyladenosine Sites With Sequence-Based Features in Multiple Species |
topic | Genetics |
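
The description names two sequence descriptors and an F-score ranking step. The following is a minimal Python sketch of how such a pipeline might look, not the authors' implementation. Several details are assumptions that the record does not specify: the 4-bit code assigned to each dinucleotide (alphabetical order here), the definition of the local frequency (count of the dinucleotide ending at position i within the prefix up to i, divided by the prefix length), and the commonly used two-class F-score formula with sample variances. All function names are hypothetical. The SFS and XGBoost steps are omitted.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Assumption: the 16 dinucleotides, in alphabetical order, get the codes 0000..1111.
DINUC_CODE = {
    a + b: [int(bit) for bit in format(index, "04b")]
    for index, (a, b) in enumerate(product(NUCLEOTIDES, repeat=2))
}


def dinucleotide_binary_encoding(seq):
    """Concatenate the 4-bit code of every overlapping dinucleotide in seq."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for i in range(len(seq) - 1):
        features.extend(DINUC_CODE[seq[i:i + 2]])
    return features


def local_dinucleotide_frequency(seq):
    """For each position i >= 1, count how often the dinucleotide ending at i
    occurs in the prefix seq[:i + 1], divided by the prefix length (assumed form)."""
    seq = seq.upper().replace("T", "U")
    features = []
    for i in range(1, len(seq)):
        prefix = seq[:i + 1]
        dinuc = seq[i - 1:i + 1]
        count = sum(
            1 for j in range(len(prefix) - 1) if prefix[j:j + 2] == dinuc
        )
        features.append(count / len(prefix))
    return features


def f_score(pos_values, neg_values):
    """Two-class F-score of a single feature; needs >= 2 samples per class."""
    all_values = list(pos_values) + list(neg_values)
    mean_all = sum(all_values) / len(all_values)
    mean_pos = sum(pos_values) / len(pos_values)
    mean_neg = sum(neg_values) / len(neg_values)
    var_pos = sum((v - mean_pos) ** 2 for v in pos_values) / (len(pos_values) - 1)
    var_neg = sum((v - mean_neg) ** 2 for v in neg_values) / (len(neg_values) - 1)
    numerator = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    denominator = var_pos + var_neg
    return numerator / denominator if denominator > 0 else 0.0


def rank_features(pos_matrix, neg_matrix):
    """Return feature indices sorted by decreasing F-score."""
    n_features = len(pos_matrix[0])
    scores = [
        f_score([row[k] for row in pos_matrix], [row[k] for row in neg_matrix])
        for k in range(n_features)
    ]
    return sorted(range(n_features), key=lambda k: scores[k], reverse=True)
```

A ranked index list like the one returned by `rank_features` is the kind of ordering a sequential forward search could consume, adding features one at a time and keeping the subset that scores best under the chosen classifier.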