Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results
Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole...
Main authors: | Haiminen, Niina; Kuhn, David N.; Parida, Laxmi; Rigoutsos, Isidore |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168497/ https://www.ncbi.nlm.nih.gov/pubmed/21915294 http://dx.doi.org/10.1371/journal.pone.0024182 |
_version_ | 1782211405895696384 |
---|---|
author | Haiminen, Niina; Kuhn, David N.; Parida, Laxmi; Rigoutsos, Isidore |
author_facet | Haiminen, Niina; Kuhn, David N.; Parida, Laxmi; Rigoutsos, Isidore |
author_sort | Haiminen, Niina |
collection | PubMed |
description | Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the identity of the used assembly program, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness. |
format | Online Article Text |
id | pubmed-3168497 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-31684972011-09-13 Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results Haiminen, Niina Kuhn, David N. Parida, Laxmi Rigoutsos, Isidore PLoS One Research Article Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the identity of the used assembly program, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness. Public Library of Science 2011-09-07 /pmc/articles/PMC3168497/ /pubmed/21915294 http://dx.doi.org/10.1371/journal.pone.0024182 Text en Haiminen et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168497/ https://www.ncbi.nlm.nih.gov/pubmed/21915294 http://dx.doi.org/10.1371/journal.pone.0024182 |