Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results

Bibliographic Details
Main authors: Haiminen, Niina; Kuhn, David N.; Parida, Laxmi; Rigoutsos, Isidore
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168497/
https://www.ncbi.nlm.nih.gov/pubmed/21915294
http://dx.doi.org/10.1371/journal.pone.0024182
_version_ 1782211405895696384
author Haiminen, Niina
Kuhn, David N.
Parida, Laxmi
Rigoutsos, Isidore
author_sort Haiminen, Niina
collection PubMed
description Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the identity of the used assembly program, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness.
format Online
Article
Text
id pubmed-3168497
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3168497 2011-09-13
Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results
Haiminen, Niina; Kuhn, David N.; Parida, Laxmi; Rigoutsos, Isidore
PLoS One
Research Article
Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the identity of the used assembly program, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness.
Public Library of Science 2011-09-07
/pmc/articles/PMC3168497/ /pubmed/21915294 http://dx.doi.org/10.1371/journal.pone.0024182
Text en
Haiminen et al.
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168497/
https://www.ncbi.nlm.nih.gov/pubmed/21915294
http://dx.doi.org/10.1371/journal.pone.0024182