
Empirical evaluation of methods for de novo genome assembly

Technologies for next-generation sequencing (NGS) have stimulated an exponential rise in high-throughput sequencing projects and resulted in the development of new read-assembly algorithms. A drastic reduction in the costs of generating short reads on the genomes of new organisms is attributable to...


Bibliographic Details
Main authors: Dida, Firaol; Yi, Gangman
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8279138/
https://www.ncbi.nlm.nih.gov/pubmed/34307867
http://dx.doi.org/10.7717/peerj-cs.636
author Dida, Firaol
Yi, Gangman
collection PubMed
description Technologies for next-generation sequencing (NGS) have stimulated an exponential rise in high-throughput sequencing projects and resulted in the development of new read-assembly algorithms. A drastic reduction in the costs of generating short reads on the genomes of new organisms is attributable to recent advances in NGS technologies such as Ion Torrent, Illumina, and PacBio. Genome research has led to the creation of high-quality reference genomes for several organisms, and de novo assembly is a key initiative that has facilitated gene discovery and other studies. More powerful analytical algorithms are needed to work on the increasing amount of sequence data. We make a thorough comparison of the de novo assembly algorithms to allow new users to clearly understand the assembly algorithms: overlap-layout-consensus and de-Bruijn-graph, string-graph based assembly, and hybrid approach. We also address the computational efficacy of each algorithm’s performance, challenges faced by the assembly tools used, and the impact of repeats. Our results compare the relative performance of the different assemblers and other related assembly differences with and without the reference genome. We hope that this analysis will contribute to further the application of de novo sequences and help the future growth of assembly algorithms.
format Online
Article
Text
id pubmed-8279138
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8279138 2021-07-22 Empirical evaluation of methods for de novo genome assembly Dida, Firaol Yi, Gangman PeerJ Comput Sci Bioinformatics Technologies for next-generation sequencing (NGS) have stimulated an exponential rise in high-throughput sequencing projects and resulted in the development of new read-assembly algorithms. A drastic reduction in the costs of generating short reads on the genomes of new organisms is attributable to recent advances in NGS technologies such as Ion Torrent, Illumina, and PacBio. Genome research has led to the creation of high-quality reference genomes for several organisms, and de novo assembly is a key initiative that has facilitated gene discovery and other studies. More powerful analytical algorithms are needed to work on the increasing amount of sequence data. We make a thorough comparison of the de novo assembly algorithms to allow new users to clearly understand the assembly algorithms: overlap-layout-consensus and de-Bruijn-graph, string-graph based assembly, and hybrid approach. We also address the computational efficacy of each algorithm’s performance, challenges faced by the assembly tools used, and the impact of repeats. Our results compare the relative performance of the different assemblers and other related assembly differences with and without the reference genome. We hope that this analysis will contribute to further the application of de novo sequences and help the future growth of assembly algorithms. PeerJ Inc. 2021-07-09 /pmc/articles/PMC8279138/ /pubmed/34307867 http://dx.doi.org/10.7717/peerj-cs.636 Text en ©2021 Dida and Yi https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Empirical evaluation of methods for de novo genome assembly
topic Bioinformatics