
Human microRNA prediction through a probabilistic co-learning model of sequence and structure

MicroRNAs (miRNAs) are small regulatory RNAs of ∼22 nt. Although hundreds of miRNAs have been identified through experimental complementary DNA cloning methods and computational efforts, previous approaches could detect only abundantly expressed miRNAs or close homologs of previously identified miRN...


Bibliographic Details
Main authors: Nam, Jin-Wu, Shin, Ki-Roo, Han, Jinju, Lee, Yoontae, Kim, V. Narry, Zhang, Byoung-Tak
Format: Text
Language: English
Published: Oxford University Press 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1159118/
https://www.ncbi.nlm.nih.gov/pubmed/15987789
http://dx.doi.org/10.1093/nar/gki668
author Nam, Jin-Wu
Shin, Ki-Roo
Han, Jinju
Lee, Yoontae
Kim, V. Narry
Zhang, Byoung-Tak
author_sort Nam, Jin-Wu
collection PubMed
description MicroRNAs (miRNAs) are small regulatory RNAs of ∼22 nt. Although hundreds of miRNAs have been identified through experimental complementary DNA cloning methods and computational efforts, previous approaches could detect only abundantly expressed miRNAs or close homologs of previously identified miRNAs. Here, we introduce a probabilistic co-learning model for miRNA gene finding, ProMiR, which simultaneously considers the structure and sequence of miRNA precursors (pre-miRNAs). On 5-fold cross-validation with 136 referenced human datasets, the efficiency of the classification shows 73% sensitivity and 96% specificity. When applied to genome screening for novel miRNAs on human chromosomes 16, 17, 18 and 19, ProMiR effectively searches distantly homologous patterns over diverse pre-miRNAs, detecting at least 23 novel miRNA gene candidates. Importantly, the miRNA gene candidates do not demonstrate clear sequence similarity to the known miRNA genes. By quantitative PCR followed by RNA interference against Drosha, we experimentally confirmed that 9 of the 23 representative candidate genes express transcripts that are processed by the miRNA biogenesis enzyme Drosha in HeLa cells, indicating that ProMiR may successfully predict miRNA genes with at least 40% accuracy. Our study suggests that the miRNA gene family may be more abundant than previously anticipated, and confer highly extensive regulatory networks on eukaryotic cells.
format Text
id pubmed-1159118
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1159118 2005-06-24 Human microRNA prediction through a probabilistic co-learning model of sequence and structure. Nam, Jin-Wu; Shin, Ki-Roo; Han, Jinju; Lee, Yoontae; Kim, V. Narry; Zhang, Byoung-Tak. Nucleic Acids Res. Article. Oxford University Press 2005. 2005-06-24. /pmc/articles/PMC1159118/ /pubmed/15987789 http://dx.doi.org/10.1093/nar/gki668 Text en © The Author 2005. Published by Oxford University Press. All rights reserved
title Human microRNA prediction through a probabilistic co-learning model of sequence and structure
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1159118/
https://www.ncbi.nlm.nih.gov/pubmed/15987789
http://dx.doi.org/10.1093/nar/gki668