PseUI: Pseudouridine sites identification based on RNA sequence information

BACKGROUND: Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, which significantly affects many cellular processes that are regulated by RNA. Thus, accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefi...

Bibliographic Details
Main authors:	He, Jingjing; Fang, Ting; Zhang, Zizheng; Huang, Bei; Zhu, Xiaolei; Xiong, Yi
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114832/ https://www.ncbi.nlm.nih.gov/pubmed/30157750 http://dx.doi.org/10.1186/s12859-018-2321-0

_version_	1783351267038330880
author	He, Jingjing; Fang, Ting; Zhang, Zizheng; Huang, Bei; Zhu, Xiaolei; Xiong, Yi
author_sort	He, Jingjing
collection	PubMed
description	BACKGROUND: Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, and it significantly affects many cellular processes that are regulated by RNA. Thus, accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefit for understanding these cellular processes. Because currently available experimental methods are inefficient and costly, it is highly desirable to develop computational methods that detect Ψ sites in RNA sequences accurately and efficiently. However, the predictive accuracy of existing computational methods is not satisfactory and still needs improvement. RESULTS: In this study, we developed a new model, PseUI, for Ψ site identification in three species: H. sapiens, S. cerevisiae, and M. musculus. First, five kinds of features, namely nucleotide composition (NC), dinucleotide composition (DC), pseudo dinucleotide composition (pseDNC), position-specific nucleotide propensity (PSNP), and position-specific dinucleotide propensity (PSDP), were generated from RNA segments. Then, a sequential forward feature selection strategy was used to obtain a compact feature subset that retains discriminative power. Based on the selected feature subsets, we built our model using a support vector machine (SVM). Finally, the generalization ability of our model was validated by both the jackknife test and independent validation tests on the benchmark datasets. The experimental results showed that our model is more accurate and stable than the previously published models. We have also provided a user-friendly web server for our model at http://zhulab.ahu.edu.cn/PseUI, and brief instructions for the web server are provided in this paper. Using these instructions, academic users can conveniently get their desired results without complicated calculations. CONCLUSION: In this study, we proposed a new predictor, PseUI, to detect Ψ sites in RNA sequences. Our model is shown to outperform the existing state-of-the-art models. It is expected that our model, PseUI, will become a useful tool for accurate identification of RNA Ψ sites. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2321-0) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-6114832
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6114832 2018-09-04 BMC Bioinformatics Research Article BioMed Central 2018-08-29 /pmc/articles/PMC6114832/ /pubmed/30157750 http://dx.doi.org/10.1186/s12859-018-2321-0 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	PseUI: Pseudouridine sites identification based on RNA sequence information
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114832/ https://www.ncbi.nlm.nih.gov/pubmed/30157750 http://dx.doi.org/10.1186/s12859-018-2321-0
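
The description above names nucleotide composition (NC) and dinucleotide composition (DC) as two of the five feature types computed from RNA segments. The following is a minimal sketch of how such composition features are commonly computed, not the authors' implementation: the function names, the example segment, and the plain frequency normalization are assumptions made for illustration, and the record does not state the segment length or any scaling used by PseUI.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides, in a fixed order so feature positions are stable


def nucleotide_composition(segment):
    """Return the frequency of A, C, G and U in the segment (4 values)."""
    segment = segment.upper()
    if not segment:
        raise ValueError("segment must not be empty")
    return [segment.count(base) / len(segment) for base in ALPHABET]


def dinucleotide_composition(segment):
    """Return the frequency of each of the 16 overlapping dinucleotides."""
    segment = segment.upper()
    if len(segment) < 2:
        raise ValueError("segment must contain at least two nucleotides")
    # Overlapping pairs: positions (0,1), (1,2), ..., (n-2, n-1)
    pairs = [segment[i:i + 2] for i in range(len(segment) - 1)]
    return [pairs.count(a + b) / len(pairs) for a, b in product(ALPHABET, repeat=2)]


# Hypothetical RNA segment, used only to show the shape of the output.
example_segment = "GAUCUAGCA"
features = nucleotide_composition(example_segment) + dinucleotide_composition(example_segment)
print(len(features))  # 4 NC values followed by 16 DC values
```

A feature vector built this way could then be passed to a feature selection step and a classifier such as an SVM, as the description outlines; the remaining three feature types (pseDNC, PSNP, PSDP) depend on details that this record does not give.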
