
IsoMiRmap: fast, deterministic and exhaustive mining of isomiRs from short RNA-seq datasets

MOTIVATION: MicroRNA (miRNA) precursor arms give rise to multiple isoforms simultaneously called ‘isomiRs.’ IsomiRs from the same arm typically differ by a few nucleotides at either their 5′ or 3′ termini or both. In humans, the identities and abundances of isomiRs depend on a person’s sex and genet...


Bibliographic Details
Main authors: Loher, Phillipe; Karathanasis, Nestoras; Londin, Eric; F. Bray, Paul; Pliatsika, Venetia; Telonis, Aristeidis G.; Rigoutsos, Isidore
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8317110/
https://www.ncbi.nlm.nih.gov/pubmed/33471076
http://dx.doi.org/10.1093/bioinformatics/btab016
author Loher, Phillipe
Karathanasis, Nestoras
Londin, Eric
F. Bray, Paul
Pliatsika, Venetia
Telonis, Aristeidis G.
Rigoutsos, Isidore
author_sort Loher, Phillipe
collection PubMed
description MOTIVATION: MicroRNA (miRNA) precursor arms give rise to multiple isoforms simultaneously called ‘isomiRs.’ IsomiRs from the same arm typically differ by a few nucleotides at either their 5′ or 3′ termini or both. In humans, the identities and abundances of isomiRs depend on a person’s sex and genetic ancestry as well as on tissue type, tissue state and disease type/subtype. Moreover, nearly half of the time the most abundant isomiR differs from the miRNA sequence found in public databases. Accurate mining of isomiRs from deep sequencing data is thus important. RESULTS: We developed isoMiRmap, a fast, standalone, user-friendly mining tool that identifies and quantifies all isomiRs by directly processing short RNA-seq datasets. IsoMiRmap is a portable ‘plug-and-play’ tool, requires minimal setup, has modest computing and storage requirements, and can process an RNA-seq dataset with 50 million reads in just a few minutes on an average laptop. IsoMiRmap deterministically and exhaustively reports all isomiRs in a given deep sequencing dataset and quantifies them accurately (no double-counting). IsoMiRmap comprehensively reports all miRNA precursor locations from which an isomiR may be transcribed, tags as ‘ambiguous’ isomiRs whose sequences exist both inside and outside of the space of known miRNA sequences and reports the public identifiers of common single-nucleotide polymorphisms and documented somatic mutations that may be present in an isomiR. IsoMiRmap also identifies isomiRs with 3’ non-templated post-transcriptional additions. Compared to similar tools, isoMiRmap is the fastest, reports more bona fide isomiRs, and provides the most comprehensive information related to an isomiR’s transcriptional origin. AVAILABILITY AND IMPLEMENTATION: The codes for isoMiRmap are freely available at https://cm.jefferson.edu/isoMiRmap/ and https://github.com/TJU-CMC-Org/isoMiRmap/. 
IsomiR profiles for the datasets of the 1000 Genomes Project, spanning five population groups, and The Cancer Genome Atlas (TCGA), spanning 33 cancer studies, are also available at https://cm.jefferson.edu/isoMiRmap/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8317110
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-83171102021-07-29 IsoMiRmap: fast, deterministic and exhaustive mining of isomiRs from short RNA-seq datasets Loher, Phillipe Karathanasis, Nestoras Londin, Eric F. Bray, Paul Pliatsika, Venetia Telonis, Aristeidis G. Rigoutsos, Isidore Bioinformatics Original Papers MOTIVATION: MicroRNA (miRNA) precursor arms give rise to multiple isoforms simultaneously called ‘isomiRs.’ IsomiRs from the same arm typically differ by a few nucleotides at either their 5′ or 3′ termini or both. In humans, the identities and abundances of isomiRs depend on a person’s sex and genetic ancestry as well as on tissue type, tissue state and disease type/subtype. Moreover, nearly half of the time the most abundant isomiR differs from the miRNA sequence found in public databases. Accurate mining of isomiRs from deep sequencing data is thus important. RESULTS: We developed isoMiRmap, a fast, standalone, user-friendly mining tool that identifies and quantifies all isomiRs by directly processing short RNA-seq datasets. IsoMiRmap is a portable ‘plug-and-play’ tool, requires minimal setup, has modest computing and storage requirements, and can process an RNA-seq dataset with 50 million reads in just a few minutes on an average laptop. IsoMiRmap deterministically and exhaustively reports all isomiRs in a given deep sequencing dataset and quantifies them accurately (no double-counting). IsoMiRmap comprehensively reports all miRNA precursor locations from which an isomiR may be transcribed, tags as ‘ambiguous’ isomiRs whose sequences exist both inside and outside of the space of known miRNA sequences and reports the public identifiers of common single-nucleotide polymorphisms and documented somatic mutations that may be present in an isomiR. IsoMiRmap also identifies isomiRs with 3’ non-templated post-transcriptional additions. 
Compared to similar tools, isoMiRmap is the fastest, reports more bona fide isomiRs, and provides the most comprehensive information related to an isomiR’s transcriptional origin. AVAILABILITY AND IMPLEMENTATION: The codes for isoMiRmap are freely available at https://cm.jefferson.edu/isoMiRmap/ and https://github.com/TJU-CMC-Org/isoMiRmap/. IsomiR profiles for the datasets of the 1000 Genomes Project, spanning five population groups, and The Cancer Genome Atlas (TCGA), spanning 33 cancer studies, are also available at https://cm.jefferson.edu/isoMiRmap/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-01-20 /pmc/articles/PMC8317110/ /pubmed/33471076 http://dx.doi.org/10.1093/bioinformatics/btab016 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title IsoMiRmap: fast, deterministic and exhaustive mining of isomiRs from short RNA-seq datasets
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8317110/
https://www.ncbi.nlm.nih.gov/pubmed/33471076
http://dx.doi.org/10.1093/bioinformatics/btab016