Incorporating Evolutionary Information and Functional Domains for Identifying RNA Splicing Factors in Humans

Bibliographic Details
Main authors: Hsu, Justin Bo-Kai, Bretaña, Neil Arvin, Lee, Tzong-Yi, Huang, Hsien-Da
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3217973/
https://www.ncbi.nlm.nih.gov/pubmed/22110674
http://dx.doi.org/10.1371/journal.pone.0027567
author Hsu, Justin Bo-Kai
Bretaña, Neil Arvin
Lee, Tzong-Yi
Huang, Hsien-Da
collection PubMed
description Regulation of pre-mRNA splicing is achieved through the interaction of RNA sequence elements and a variety of RNA-splicing related proteins (splicing factors). The splicing machinery in humans is not yet fully elucidated, partly because splicing factors in humans have not been exhaustively identified. Furthermore, experimental methods for splicing factor identification are time-consuming and labor-intensive. Although many computational methods have been proposed for the identification of RNA-binding proteins, none so far focuses on the identification of RNA-splicing related proteins. This motivated us to design a method that identifies human splicing factors using experimentally verified splicing factors. The investigation of amino acid composition reveals remarkable differences between splicing factors and non-splicing proteins. A support vector machine (SVM) is used to construct a predictive model, and five-fold cross-validation indicates that the SVM model trained with amino acid composition provides a promising accuracy (80.22%). Another basic feature, amino acid dipeptide composition, is also examined and yields predictive performance similar to that of amino acid composition. In addition, this work shows that incorporating evolutionary information and domain information could improve the predictive performance. The constructed models effectively classify (73.65% accuracy) an independent data set of human splicing factors. The result of independent testing indicates that in silico identification could be a feasible means of conducting preliminary analyses of splicing factors and significantly reducing the number of potential targets that require further in vivo or in vitro confirmation.
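The two basic features named in the description, amino acid composition and dipeptide composition, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 20-letter alphabet ordering, and the choice to skip non-standard residues are assumptions made here for illustration only. Vectors like these would then be passed to an SVM and evaluated by cross-validation, which is not shown.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in an arbitrary but fixed order (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in the same fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue."""
    seq = sequence.upper()
    # Residues outside the 20-letter alphabet (e.g. X) are ignored.
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return a 400-element list with the fraction of each adjacent residue pair."""
    seq = sequence.upper()
    valid = set(DIPEPTIDES)
    # Slide a window of width 2 over the sequence; keep only standard pairs.
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(p for p in pairs if p in valid)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[dp] / total for dp in DIPEPTIDES]


if __name__ == "__main__":
    example = "MSRKEGAAC"  # hypothetical toy sequence, not from the paper
    print(len(amino_acid_composition(example)))  # 20 features
    print(len(dipeptide_composition(example)))   # 400 features
```

Each protein becomes a fixed-length vector regardless of its sequence length, which is what lets a standard classifier such as an SVM be trained on proteins of different sizes.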
format Online
Article
Text
id pubmed-3217973
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3217973 2011-11-21 PLoS One Research Article Public Library of Science 2011-11-16 Text en Hsu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Incorporating Evolutionary Information and Functional Domains for Identifying RNA Splicing Factors in Humans
topic Research Article