Systematic Curation of miRBase Annotation Using Integrated Small RNA High-Throughput Sequencing Data for C. elegans and Drosophila

MicroRNAs (miRNAs) are a class of 20–23 nucleotide small RNAs that regulate gene expression post-transcriptionally in animals and plants. Annotation of miRNAs by the miRNA database (miRBase) has largely relied on computational approaches. As a result, many miRBase entries lack experimental validatio...

Bibliographic Details
Main authors: Wang, Xiangfeng; Liu, X. Shirley
Format: Online Article Text
Language: English
Published: Frontiers Research Foundation 2011
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268580/
https://www.ncbi.nlm.nih.gov/pubmed/22303321
http://dx.doi.org/10.3389/fgene.2011.00025
_version_ 1782222383348711424
author Wang, Xiangfeng
Liu, X. Shirley
author_facet Wang, Xiangfeng
Liu, X. Shirley
author_sort Wang, Xiangfeng
collection PubMed
description MicroRNAs (miRNAs) are a class of 20–23 nucleotide small RNAs that regulate gene expression post-transcriptionally in animals and plants. Annotation of miRNAs by the miRNA database (miRBase) has largely relied on computational approaches. As a result, many miRBase entries lack experimental validation, and discrepancies between miRBase annotation and actual miRNA sequences are often observed. In this study, we integrated the small RNA sequencing (smRNA-seq) datasets in Caenorhabditis elegans and Drosophila melanogaster and devised an analytical pipeline coupled with detailed manual inspection to curate miRNA annotation systematically in miRBase. Our analysis reveals that 19 (17.0%) and 51 (31.3%) of the miRNA entries with detectable smRNA-seq reads have mature sequence discrepancies in C. elegans and D. melanogaster, respectively. These discrepancies frequently occur either for conserved miRNA families whose mature sequences were predicted according to their homologous counterparts in other species or for miRNAs whose precursor miRNA (pre-miRNA) hairpins produce an abundance of multiple miRNA isoforms or variants. Our analysis shows that while Drosophila pre-miRNAs, on average, produce less than 60% accurate mature miRNA reads in addition to their 5′ and 3′ variant isoforms, the precision of miRNA processing in C. elegans is much higher, at over 90%. Based on the revised miRNA sequences, we analyzed expression patterns of the more conserved (MC) and less conserved (LC) miRNAs and found that, whereas MC miRNAs are often co-expressed at multiple developmental stages, LC miRNAs tend to be expressed specifically at fewer stages.
format Online
Article
Text
id pubmed-3268580
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Frontiers Research Foundation
record_format MEDLINE/PubMed
spelling pubmed-3268580 2012-02-02 Systematic Curation of miRBase Annotation Using Integrated Small RNA High-Throughput Sequencing Data for C. elegans and Drosophila Wang, Xiangfeng Liu, X. Shirley Front Genet Genetics MicroRNAs (miRNAs) are a class of 20–23 nucleotide small RNAs that regulate gene expression post-transcriptionally in animals and plants. Annotation of miRNAs by the miRNA database (miRBase) has largely relied on computational approaches. As a result, many miRBase entries lack experimental validation, and discrepancies between miRBase annotation and actual miRNA sequences are often observed. In this study, we integrated the small RNA sequencing (smRNA-seq) datasets in Caenorhabditis elegans and Drosophila melanogaster and devised an analytical pipeline coupled with detailed manual inspection to curate miRNA annotation systematically in miRBase. Our analysis reveals that 19 (17.0%) and 51 (31.3%) of the miRNA entries with detectable smRNA-seq reads have mature sequence discrepancies in C. elegans and D. melanogaster, respectively. These discrepancies frequently occur either for conserved miRNA families whose mature sequences were predicted according to their homologous counterparts in other species or for miRNAs whose precursor miRNA (pre-miRNA) hairpins produce an abundance of multiple miRNA isoforms or variants. Our analysis shows that while Drosophila pre-miRNAs, on average, produce less than 60% accurate mature miRNA reads in addition to their 5′ and 3′ variant isoforms, the precision of miRNA processing in C. elegans is much higher, at over 90%. Based on the revised miRNA sequences, we analyzed expression patterns of the more conserved (MC) and less conserved (LC) miRNAs and found that, whereas MC miRNAs are often co-expressed at multiple developmental stages, LC miRNAs tend to be expressed specifically at fewer stages.
Frontiers Research Foundation 2011-05-26 /pmc/articles/PMC3268580/ /pubmed/22303321 http://dx.doi.org/10.3389/fgene.2011.00025 Text en Copyright © 2011 Wang and Liu. http://www.frontiersin.org/licenseagreement This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.
spellingShingle Genetics
Wang, Xiangfeng
Liu, X. Shirley
Systematic Curation of miRBase Annotation Using Integrated Small RNA High-Throughput Sequencing Data for C. elegans and Drosophila
title Systematic Curation of miRBase Annotation Using Integrated Small RNA High-Throughput Sequencing Data for C. elegans and Drosophila
title_sort systematic curation of mirbase annotation using integrated small rna high-throughput sequencing data for c. elegans and drosophila
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268580/
https://www.ncbi.nlm.nih.gov/pubmed/22303321
http://dx.doi.org/10.3389/fgene.2011.00025
work_keys_str_mv AT wangxiangfeng systematiccurationofmirbaseannotationusingintegratedsmallrnahighthroughputsequencingdataforcelegansanddrosophila
AT liuxshirley systematiccurationofmirbaseannotationusingintegratedsmallrnahighthroughputsequencingdataforcelegansanddrosophila