
PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features


Bibliographic Details
Main authors: Song, Bowen, Chen, Kunqi, Tang, Yujiao, Ma, Jialin, Meng, Jia, Wei, Zhen
Format: Online Article Text
Language: English
Published: SAGE Publications 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7285933/
https://www.ncbi.nlm.nih.gov/pubmed/32565674
http://dx.doi.org/10.1177/1176934320925752
_version_ 1783544789545779200
author Song, Bowen
Chen, Kunqi
Tang, Yujiao
Ma, Jialin
Meng, Jia
Wei, Zhen
author_facet Song, Bowen
Chen, Kunqi
Tang, Yujiao
Ma, Jialin
Meng, Jia
Wei, Zhen
author_sort Song, Bowen
collection PubMed
description Pseudouridine (Ψ) was the first posttranscriptional modification to be discovered and is the most prevalent; it has been widely studied over the past decades. Pseudouridine has been observed in almost all kinds of RNA and shown to have important biological functions. Currently, the time-consuming and costly procedures of experimental approaches limit their use in real-life Ψ site detection. Alternatively, by taking advantage of the explosive growth of Ψ sequencing data, computational methods may provide a more cost-effective avenue. To date, the existing mouse Ψ site predictors have all been developed from sequence-derived features, and their performance can be further improved by adding features derived from domain knowledge. Therefore, a genomic feature-based computational method is highly desirable to increase the accuracy and efficiency of identifying Ψ RNA modification in the mouse transcriptome. In our study, a predictive framework, PSI-MOUSE, was built. Besides the conventional sequence-based features, PSI-MOUSE introduced, for the first time, 38 additional genomic features derived from the mouse genome, which yielded a satisfactory improvement in prediction performance compared with other existing models. Moreover, PSI-MOUSE automatically annotates the putative Ψ sites with diverse types of posttranscriptional regulation (RNA-binding protein [RBP]-binding regions, miRNA-RNA interactions, and splicing sites), so it can serve as a useful research tool for the study of Ψ RNA modification in the mouse genome. Finally, 3282 experimentally validated mouse Ψ sites were collected in a database with customized query functions. For the convenience of academic users, a website was built to provide a user-friendly interface for querying and analyzing the database. The website is freely accessible at www.xjtlu.edu.cn/biologicalsciences/psimouse and http://psimouse.rnamd.com.
We introduced genome-derived features to mouse Ψ site prediction for the first time and achieved good performance. Compared with the existing state-of-the-art methods, our newly developed approach, PSI-MOUSE, obtained a substantial improvement in prediction accuracy, demonstrating the reliable contribution of genomic features to the prediction of RNA modifications in a species other than human.
format Online
Article
Text
id pubmed-7285933
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-7285933 2020-06-19 PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features Song, Bowen Chen, Kunqi Tang, Yujiao Ma, Jialin Meng, Jia Wei, Zhen Evol Bioinform Online Original Research SAGE Publications 2020-06-09 /pmc/articles/PMC7285933/ /pubmed/32565674 http://dx.doi.org/10.1177/1176934320925752 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Original Research
Song, Bowen
Chen, Kunqi
Tang, Yujiao
Ma, Jialin
Meng, Jia
Wei, Zhen
PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features
title PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features
title_full PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features
title_fullStr PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features
title_full_unstemmed PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features
title_short PSI-MOUSE: Predicting Mouse Pseudouridine Sites From Sequence and Genome-Derived Features
title_sort psi-mouse: predicting mouse pseudouridine sites from sequence and genome-derived features
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7285933/
https://www.ncbi.nlm.nih.gov/pubmed/32565674
http://dx.doi.org/10.1177/1176934320925752