
HASLR: Fast Hybrid Assembly of Long Reads

Third-generation sequencing technologies from companies such as Oxford Nanopore and Pacific Biosciences have paved the way for building more contiguous and potentially gap-free assemblies. The larger effective length of their reads has provided a means to overcome the challenges of short to mid-rang...


Bibliographic Details
Main authors: Haghshenas, Ehsan, Asghari, Hossein, Stoye, Jens, Chauve, Cedric, Hach, Faraz
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7419660/
https://www.ncbi.nlm.nih.gov/pubmed/32781410
http://dx.doi.org/10.1016/j.isci.2020.101389
_version_ 1783569931928862720
author Haghshenas, Ehsan
Asghari, Hossein
Stoye, Jens
Chauve, Cedric
Hach, Faraz
collection PubMed
description Third-generation sequencing technologies from companies such as Oxford Nanopore and Pacific Biosciences have paved the way for building more contiguous and potentially gap-free assemblies. The larger effective length of their reads has provided a means to overcome the challenges of short to mid-range repeats. Currently, accurate long read assemblers are computationally expensive, whereas faster methods are not as accurate. Moreover, despite recent advances in third-generation sequencing, researchers still tend to generate accurate short reads for many of the analysis tasks. Here, we present HASLR, a hybrid assembler that uses error-prone long reads together with high-quality short reads to efficiently generate accurate genome assemblies. Our experiments show that HASLR is not only the fastest assembler but also the one with the lowest number of misassemblies on most of the samples, while being on par with other assemblers in terms of contiguity and accuracy.
format Online
Article
Text
id pubmed-7419660
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7419660 2020-08-14 HASLR: Fast Hybrid Assembly of Long Reads Haghshenas, Ehsan Asghari, Hossein Stoye, Jens Chauve, Cedric Hach, Faraz iScience Article Elsevier 2020-07-25 /pmc/articles/PMC7419660/ /pubmed/32781410 http://dx.doi.org/10.1016/j.isci.2020.101389 Text en © 2020 The Author(s). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title HASLR: Fast Hybrid Assembly of Long Reads
topic Article