
Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks

BACKGROUND: Genetic studies are increasingly based on short, noisy sequences from next generation scanners. Typically complete DNA sequences are assembled by matching short NextGen sequences against reference genomes. Despite considerable algorithmic gains since the turn of the millennium, matching both single ende...


Bibliographic Details
Main author:	Langdon, W B
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2015
Subjects:	Short Report
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304608/ https://www.ncbi.nlm.nih.gov/pubmed/25621011 http://dx.doi.org/10.1186/s13040-014-0034-0

author	Langdon, W B
collection	PubMed
description	BACKGROUND: Genetic studies are increasingly based on short, noisy sequences from next generation scanners. Typically complete DNA sequences are assembled by matching short NextGen sequences against reference genomes. Despite considerable algorithmic gains since the turn of the millennium, matching both single-ended and paired-end strings to a reference remains computationally demanding. Furthermore, tailoring bioinformatics tools to each new task or scanner remains highly skilled and labour-intensive work. With this in mind, we recently demonstrated a genetic programming based automated technique which generated a version of the state-of-the-art alignment tool Bowtie2 that was considerably faster on short sequences produced by a scanner at the Broad Institute and released as part of The Thousand Genome Project. RESULTS: Bowtie2 (GP) and the original Bowtie2 release were compared on bioplanet’s GCAT synthetic benchmarks. Bowtie2 (GP) enhancements were also applied to the latest Bowtie2 release (2.2.3, 29 May 2014) and retained both the GP and the manually introduced improvements. CONCLUSIONS: On both single-ended and paired-end synthetic next generation DNA sequence GCAT benchmarks, Bowtie2GP runs up to 45% faster than Bowtie2. The loss in accuracy can be as little as 0.2–0.5% but is up to 2.5% for longer sequences.
format	Online Article Text
id	pubmed-4304608
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4304608 2015-01-24 Langdon, W B BioData Min Short Report BioMed Central 2015-01-08 /pmc/articles/PMC4304608/ /pubmed/25621011 http://dx.doi.org/10.1186/s13040-014-0034-0 Text en © Langdon; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks
topic	Short Report
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304608/ https://www.ncbi.nlm.nih.gov/pubmed/25621011 http://dx.doi.org/10.1186/s13040-014-0034-0
