
Handling multi-mapped reads in RNA-seq

Many eukaryotic genomes harbour large numbers of duplicated sequences, of diverse biotypes, resulting from several mechanisms including recombination, whole genome duplication and retro-transposition. Such repeated sequences complicate gene/transcript quantification during RNA-seq analysis due to re...


Bibliographic Details
Main authors: Deschamps-Francoeur, Gabrielle; Simoneau, Joël; Scott, Michelle S.
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7330433/
https://www.ncbi.nlm.nih.gov/pubmed/32637053
http://dx.doi.org/10.1016/j.csbj.2020.06.014
_version_ 1783553121102856192
author Deschamps-Francoeur, Gabrielle
Simoneau, Joël
Scott, Michelle S.
collection PubMed
description Many eukaryotic genomes harbour large numbers of duplicated sequences, of diverse biotypes, resulting from several mechanisms including recombination, whole genome duplication and retro-transposition. Such repeated sequences complicate gene/transcript quantification during RNA-seq analysis due to reads mapping to more than one locus, sometimes involving genes embedded in other genes. Genes of different biotypes have dissimilar levels of sequence duplication, with long-noncoding RNAs and messenger RNAs sharing less sequence similarity to other genes than biotypes encoding shorter RNAs. Many strategies have been elaborated to handle these multi-mapped reads, resulting in increased accuracy in gene/transcript quantification, although separate tools are typically used to estimate the abundance of short and long genes due to their dissimilar characteristics. This review discusses the mechanisms leading to sequence duplication, the biotypes affected, the computational strategies employed to deal with multi-mapped reads and the challenges that still remain to be overcome.
format Online
Article
Text
id pubmed-7330433
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-7330433 2020-07-06 Handling multi-mapped reads in RNA-seq. Deschamps-Francoeur, Gabrielle; Simoneau, Joël; Scott, Michelle S. Comput Struct Biotechnol J. Review Article. Research Network of Computational and Structural Biotechnology, 2020-06-12. /pmc/articles/PMC7330433/ /pubmed/32637053 http://dx.doi.org/10.1016/j.csbj.2020.06.014 Text en © 2020 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Handling multi-mapped reads in RNA-seq
topic Review Article