Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine

BACKGROUND: MicroRNAs (miRNAs) are a group of short (~22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large number of similar hairpins can be folded in many genomes. Almost all current methods for co...

Bibliographic Details
Main Authors: Xue, Chenghai; Li, Fei; He, Tao; Liu, Guo-Ping; Li, Yanda; Zhang, Xuegong
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1360673/
https://www.ncbi.nlm.nih.gov/pubmed/16381612
http://dx.doi.org/10.1186/1471-2105-6-310
author Xue, Chenghai
Li, Fei
He, Tao
Liu, Guo-Ping
Li, Yanda
Zhang, Xuegong
collection PubMed
description BACKGROUND: MicroRNAs (miRNAs) are a group of short (~22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large number of similar hairpins can be folded in many genomes. Almost all current methods for computational prediction of miRNAs use comparative genomic approaches to identify putative pre-miRNAs from candidate hairpins. An ab initio method for distinguishing pre-miRNAs from sequence segments with pre-miRNA-like hairpin structures is lacking. Being able to classify real vs. pseudo pre-miRNAs is important both for understanding the nature of miRNAs and for developing ab initio prediction methods that can discover new miRNAs without known homology. RESULTS: A set of novel features of local contiguous structure-sequence information is proposed for distinguishing the hairpins of real pre-miRNAs and pseudo pre-miRNAs. A support vector machine (SVM) is applied to these features to classify real vs. pseudo pre-miRNAs, achieving about 90% accuracy on human data. Remarkably, the SVM classifier built on human data can correctly identify up to 90% of the pre-miRNAs from other species, including plants and viruses, without utilizing any comparative genomics information. CONCLUSION: The local structure-sequence features reflect discriminative and conserved characteristics of miRNAs, and the successful ab initio classification of real and pseudo pre-miRNAs opens a new approach for discovering new miRNAs.
format Text
id pubmed-1360673
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1360673 2006-02-10 Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine Xue, Chenghai Li, Fei He, Tao Liu, Guo-Ping Li, Yanda Zhang, Xuegong BMC Bioinformatics Methodology Article BACKGROUND: MicroRNAs (miRNAs) are a group of short (~22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large number of similar hairpins can be folded in many genomes. Almost all current methods for computational prediction of miRNAs use comparative genomic approaches to identify putative pre-miRNAs from candidate hairpins. An ab initio method for distinguishing pre-miRNAs from sequence segments with pre-miRNA-like hairpin structures is lacking. Being able to classify real vs. pseudo pre-miRNAs is important both for understanding the nature of miRNAs and for developing ab initio prediction methods that can discover new miRNAs without known homology. RESULTS: A set of novel features of local contiguous structure-sequence information is proposed for distinguishing the hairpins of real pre-miRNAs and pseudo pre-miRNAs. A support vector machine (SVM) is applied to these features to classify real vs. pseudo pre-miRNAs, achieving about 90% accuracy on human data. Remarkably, the SVM classifier built on human data can correctly identify up to 90% of the pre-miRNAs from other species, including plants and viruses, without utilizing any comparative genomics information. CONCLUSION: The local structure-sequence features reflect discriminative and conserved characteristics of miRNAs, and the successful ab initio classification of real and pseudo pre-miRNAs opens a new approach for discovering new miRNAs. BioMed Central 2005-12-29 /pmc/articles/PMC1360673/ /pubmed/16381612 http://dx.doi.org/10.1186/1471-2105-6-310 Text en Copyright © 2005 Xue et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine
topic Methodology Article
work_keys_str_mv AT xuechenghai classificationofrealandpseudomicrornaprecursorsusinglocalstructuresequencefeaturesandsupportvectormachine
AT lifei classificationofrealandpseudomicrornaprecursorsusinglocalstructuresequencefeaturesandsupportvectormachine
AT hetao classificationofrealandpseudomicrornaprecursorsusinglocalstructuresequencefeaturesandsupportvectormachine
AT liuguoping classificationofrealandpseudomicrornaprecursorsusinglocalstructuresequencefeaturesandsupportvectormachine
AT liyanda classificationofrealandpseudomicrornaprecursorsusinglocalstructuresequencefeaturesandsupportvectormachine
AT zhangxuegong classificationofrealandpseudomicrornaprecursorsusinglocalstructuresequencefeaturesandsupportvectormachine
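
Illustrative note: the description field above mentions features built from "local contiguous structure-sequence information" but does not spell out the encoding. The sketch below shows one plausible reading, stated here as an assumption rather than as the authors' exact procedure: each interior position contributes a "triplet" made of its nucleotide plus the paired/unpaired state of itself and its two neighbours, giving 4 x 8 = 32 normalized counts. The function name `triplet_features` and the example hairpin are hypothetical; the dot-bracket structure is assumed to come from an external RNA folding program, and training the SVM on the resulting vectors is not shown.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Assumption: only paired vs. unpaired matters, so ")" is folded into "(".
STATES = "(."


def triplet_features(sequence, structure):
    """Return 32 normalized triplet frequencies for one hairpin.

    sequence  -- RNA (or DNA) string, e.g. "GGGAAACCC"
    structure -- dot-bracket string of the same length, e.g. "(((...)))"
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    seq = sequence.upper().replace("T", "U")
    states = structure.replace(")", "(")

    # One key per (middle nucleotide, three structure states) combination.
    keys = [n + "".join(t) for n in NUCLEOTIDES for t in product(STATES, repeat=3)]
    counts = dict.fromkeys(keys, 0)

    # Slide a window of three positions; the key uses the middle nucleotide.
    for i in range(1, len(seq) - 1):
        key = seq[i] + states[i - 1:i + 2]
        if key in counts:  # skips ambiguous bases such as "N"
            counts[key] += 1

    total = sum(counts.values())
    return {k: (v / total if total else 0.0) for k, v in counts.items()}


# Hypothetical toy hairpin, for illustration only.
features = triplet_features("GGGAAACCC", "(((...)))")
```

The resulting dictionary can be flattened in a fixed key order to form the input vector for any SVM implementation; counting every interior position (rather than, say, only stem positions) is a simplification of this sketch.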