
InsertionMapper: a pipeline tool for the identification of targeted sequences from multidimensional high throughput sequencing data

BACKGROUND: The advent of next-generation high-throughput technologies has revolutionized whole genome sequencing, yet some experiments require sequencing only of targeted regions of the genome from a very large number of samples. These regions can be amplified by PCR and sequenced by next-generatio...


Bibliographic Details
Main Authors: Xiong, Wenwei; He, Limei; Li, Yubin; Dooner, Hugo K; Du, Chunguang
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850689/
https://www.ncbi.nlm.nih.gov/pubmed/24090499
http://dx.doi.org/10.1186/1471-2164-14-679
author Xiong, Wenwei
He, Limei
Li, Yubin
Dooner, Hugo K
Du, Chunguang
collection PubMed
description BACKGROUND: The advent of next-generation high-throughput technologies has revolutionized whole genome sequencing, yet some experiments require sequencing only targeted regions of the genome from a very large number of samples. These regions can be amplified by PCR and sequenced by next-generation methods using a multidimensional pooling strategy. However, no generalized tool is currently available for the computational analysis of target-enriched NGS data from multidimensional pools. RESULTS: Here we present InsertionMapper, a pipeline tool for the identification of targeted sequences from multidimensional high throughput sequencing data. InsertionMapper consists of four independently working modules: Data Preprocessing, Database Modeling, Dimension Deconvolution and Element Mapping. We illustrate InsertionMapper with an example from our project 'New reverse genetics resources for maize', which aims to sequence-index a collection of 15,000 independent insertion sites of the transposon Ds in maize. Identified sequences are validated by PCR assays. The pipeline is applicable to similar scenarios that require analysis of the very large output of short reads produced in NGS experiments on targeted genome sequences. CONCLUSIONS: InsertionMapper proved effective for the identification of target-enriched sequences from multidimensional high throughput sequencing data. With adjustable parameters and experiment configurations, the tool can save considerable computational effort for biologists interested in identifying their sequences of interest within the huge output of modern DNA sequencers. InsertionMapper is freely accessible at https://sourceforge.net/p/insertionmapper and http://bo.csam.montclair.edu/du/insertionmapper.
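The abstract names a multidimensional pooling strategy and a Dimension Deconvolution module but does not describe the algorithm. The following is a minimal sketch of the general idea only, not InsertionMapper's actual implementation. It assumes each sample sits at one coordinate per dimension, that one pool is sequenced per coordinate, and that a sequence seen in exactly one pool per dimension can be assigned to the sample at that coordinate. The function name `deconvolve`, the `pool_hits` layout, and the example site names are all hypothetical.

```python
from collections import defaultdict
from itertools import product


def deconvolve(pool_hits):
    """Assign sequences to sample coordinates from pooled observations.

    pool_hits maps (dimension, pool_index) -> set of sequence ids seen in
    that pool (hypothetical input layout, assumed for illustration).
    Returns (assigned, ambiguous):
      assigned:  sequence id -> single coordinate tuple
      ambiguous: sequence id -> sorted list of candidate coordinates
    Sequences absent from any dimension are dropped.
    """
    by_seq = defaultdict(lambda: defaultdict(set))
    dims = set()
    for (dim, idx), seqs in pool_hits.items():
        dims.add(dim)
        for seq in seqs:
            by_seq[seq][dim].add(idx)
    dims = sorted(dims)

    assigned, ambiguous = {}, {}
    for seq, per_dim in by_seq.items():
        if any(d not in per_dim for d in dims):
            continue  # not observed in every dimension: cannot be placed
        if all(len(per_dim[d]) == 1 for d in dims):
            # exactly one pool per dimension: unique coordinate
            assigned[seq] = tuple(next(iter(per_dim[d])) for d in dims)
        else:
            # several pools in some dimension: list every candidate
            ambiguous[seq] = sorted(product(*(sorted(per_dim[d]) for d in dims)))
    return assigned, ambiguous


# Toy three-dimensional example with made-up site names.
hits = {
    ("x", 0): {"siteA", "siteC"},
    ("x", 1): {"siteB"},
    ("y", 0): {"siteA", "siteB"},
    ("y", 1): {"siteB"},
    ("z", 0): {"siteA", "siteB"},
}
assigned, ambiguous = deconvolve(hits)
```

In this toy input, `siteA` occurs in one pool per dimension and gets a single coordinate, `siteB` occurs in two `y` pools and is reported with both candidate coordinates, and `siteC` is dropped because it appears in only one dimension. A real pipeline would presumably also apply read-count thresholds before trusting a pool hit; that step is omitted here.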
format Online
Article
Text
id pubmed-3850689
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3850689 2013-12-05 InsertionMapper: a pipeline tool for the identification of targeted sequences from multidimensional high throughput sequencing data Xiong, Wenwei He, Limei Li, Yubin Dooner, Hugo K Du, Chunguang BMC Genomics Software BioMed Central 2013-10-04 /pmc/articles/PMC3850689/ /pubmed/24090499 http://dx.doi.org/10.1186/1471-2164-14-679 Text en Copyright © 2013 Xiong et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title InsertionMapper: a pipeline tool for the identification of targeted sequences from multidimensional high throughput sequencing data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850689/
https://www.ncbi.nlm.nih.gov/pubmed/24090499
http://dx.doi.org/10.1186/1471-2164-14-679