Mammalian MicroRNA Prediction through a Support Vector Machine Model of Sequence and Structure

BACKGROUND: MicroRNAs (miRNAs) are endogenous small noncoding RNA gene products, on average 22 nt long, found in a wide variety of organisms. They play important regulatory roles by targeting mRNAs for degradation or translational repression. There are 377 known mouse miRNAs and 475 known human miRN...

Bibliographic Details
Main authors: Sheng, Ying, Engström, Pär G., Lenhard, Boris
Format: Text
Language: English
Published: Public Library of Science 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1978525/
https://www.ncbi.nlm.nih.gov/pubmed/17895987
http://dx.doi.org/10.1371/journal.pone.0000946
author Sheng, Ying
Engström, Pär G.
Lenhard, Boris
collection PubMed
description BACKGROUND: MicroRNAs (miRNAs) are endogenous small noncoding RNA gene products, on average 22 nt long, found in a wide variety of organisms. They play important regulatory roles by targeting mRNAs for degradation or translational repression. There are 377 known mouse miRNAs and 475 known human miRNAs in the May 2007 release of the miRBase database, the majority of which are conserved between the two species. A number of recent reports imply that it is likely that many mammalian miRNAs remain to be discovered. The possibility that there are more of them expressed at lower levels or in more specialized expression contexts calls for the exploitation of genome sequence information to accelerate their discovery. METHODOLOGY/PRINCIPAL FINDINGS: In this article, we describe a computational method, mirCoS, that uses three support vector machine models sequentially to discover new miRNA candidates in mammalian genomes based on sequence, secondary structure, and conservation. mirCoS can efficiently detect the majority of known miRNAs and predicts an extensive set of hairpin structures based on human-mouse comparisons. In total, 3476 mouse candidates and 3441 human candidates were found. These hairpins are more similar to known miRNAs than to negative controls in several aspects not considered by the prediction algorithm. A significant fraction of predictions is supported by existing expression evidence. CONCLUSIONS/SIGNIFICANCE: Using a novel approach, mirCoS performs comparably to or better than existing miRNA prediction methods, and contributes a significant number of new candidate miRNAs for experimental verification.
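The sequential use of three classifiers mentioned in the description can be pictured as a filter cascade, where a candidate hairpin is kept only if every stage accepts it. The sketch below is illustrative only and is not the mirCoS implementation: each stage is a plain linear decision function standing in for a trained support vector machine, and the feature names (`gc_content`, `paired_fraction`, `identity`), the weights, the biases, and the assignment of one feature group per stage are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class LinearStage:
    """Stand-in for one trained SVM: accepts when w . x + b > 0.

    Feature names, weights, and bias are hypothetical illustration values.
    """
    name: str
    features: Sequence[str]
    weights: Sequence[float]
    bias: float

    def decision(self, candidate: Dict[str, float]) -> float:
        # Linear decision value computed over this stage's own features only.
        return sum(w * candidate[f] for w, f in zip(self.weights, self.features)) + self.bias


def run_cascade(candidate: Dict[str, float], stages: Sequence[LinearStage]) -> List[str]:
    """Return the names of the stages passed, stopping at the first rejection."""
    passed = []
    for stage in stages:
        if stage.decision(candidate) <= 0:
            break
        passed.append(stage.name)
    return passed


def is_candidate(candidate: Dict[str, float], stages: Sequence[LinearStage]) -> bool:
    """A hairpin is kept only if every stage in the cascade accepts it."""
    return len(run_cascade(candidate, stages)) == len(stages)


# Hypothetical three-stage cascade: sequence, then structure, then conservation.
STAGES = [
    LinearStage("sequence", ("gc_content",), (-1.0,), 0.7),
    LinearStage("structure", ("paired_fraction",), (1.0,), -0.6),
    LinearStage("conservation", ("identity",), (1.0,), -0.8),
]

# Made-up feature vectors, not data from the article.
hairpin = {"gc_content": 0.5, "paired_fraction": 0.75, "identity": 0.9}
background = {"gc_content": 0.5, "paired_fraction": 0.4, "identity": 0.95}

print(is_candidate(hairpin, STAGES))
print(run_cascade(background, STAGES))
```

Stopping at the first rejecting stage means later, possibly more expensive, stages only see candidates that survived the earlier ones, which is one plausible reason to apply models sequentially rather than jointly.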
format Text
id pubmed-1978525
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1978525 2007-09-26 Sheng, Ying Engström, Pär G. Lenhard, Boris PLoS One Research Article Public Library of Science 2007-09-26 /pmc/articles/PMC1978525/ /pubmed/17895987 http://dx.doi.org/10.1371/journal.pone.0000946 Text en Sheng et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Mammalian MicroRNA Prediction through a Support Vector Machine Model of Sequence and Structure
topic Research Article