A context-based approach to identify the most likely mapping for RNA-seq experiments

BACKGROUND: Sequencing of mRNA (RNA-seq) by next generation sequencing technologies is widely used for analyzing the transcriptomic state of a cell. Here, one of the main challenges is the mapping of a sequenced read to its transcriptomic origin. As a simple alignment to the genome will fail to iden...

Bibliographic Details
Main authors:	Bonfert, Thomas; Csaba, Gergely; Zimmer, Ralf; Friedel, Caroline C
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3358662/ https://www.ncbi.nlm.nih.gov/pubmed/22537048 http://dx.doi.org/10.1186/1471-2105-13-S6-S9

author	Bonfert, Thomas Csaba, Gergely Zimmer, Ralf Friedel, Caroline C
author_facet	Bonfert, Thomas Csaba, Gergely Zimmer, Ralf Friedel, Caroline C
author_sort	Bonfert, Thomas
collection	PubMed
description	BACKGROUND: Sequencing of mRNA (RNA-seq) by next generation sequencing technologies is widely used for analyzing the transcriptomic state of a cell. Here, one of the main challenges is the mapping of a sequenced read to its transcriptomic origin. As a simple alignment to the genome will fail to identify reads crossing splice junctions and a transcriptome alignment will miss novel splice sites, several approaches have been developed for this purpose. Most of these approaches have two drawbacks. First, each read is assigned to a location independent of whether the corresponding gene is expressed or not, i.e. information from other reads is not taken into account. Second, in case of multiple possible mappings, the mapping with the fewest mismatches is usually chosen, which may lead to wrong assignments due to sequencing errors. RESULTS: To address these problems, we developed ContextMap, which efficiently uses information on the context of a read, i.e. reads mapping to the same expressed region. The context information is used to resolve possible ambiguities and, thus, a much larger degree of ambiguity can be allowed in the initial stage in order to detect all possible candidate positions. Although ContextMap can be used as a stand-alone version using either a genome or transcriptome as input, the version presented in this article is focused on refining initial mappings provided by other mapping algorithms. Evaluation results on simulated sequencing reads showed that the application of ContextMap to either TopHat or MapSplice mappings improved the mapping accuracy of both initial mappings considerably. CONCLUSIONS: In this article, we show that the context of reads mapping to nearby locations provides valuable information for identifying the best unique mapping for a read. Using our method, mappings provided by other state-of-the-art methods can be refined and alignment accuracy can be further improved. AVAILABILITY: http://www.bio.ifi.lmu.de/ContextMap.
format	Online Article Text
id	pubmed-3358662
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3358662 2012-06-07 A context-based approach to identify the most likely mapping for RNA-seq experiments Bonfert, Thomas; Csaba, Gergely; Zimmer, Ralf; Friedel, Caroline C BMC Bioinformatics Proceedings BioMed Central 2012-04-19 /pmc/articles/PMC3358662/ /pubmed/22537048 http://dx.doi.org/10.1186/1471-2105-13-S6-S9 Text en Copyright ©2012 Bonfert et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	A context-based approach to identify the most likely mapping for RNA-seq experiments
topic	Proceedings
