
Integrating Hi-C links with assembly graphs for chromosome-scale assembly

Bibliographic Details
Main authors: Ghurye, Jay; Rhie, Arang; Walenz, Brian P.; Schmitt, Anthony; Selvaraj, Siddarth; Pop, Mihai; Phillippy, Adam M.; Koren, Sergey
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6719893/
https://www.ncbi.nlm.nih.gov/pubmed/31433799
http://dx.doi.org/10.1371/journal.pcbi.1007273
Description
Summary: Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, there are limited open-source tools available. Errors, particularly inversions and fusions across chromosomes, remain higher than alternate scaffolding technologies. We present a novel open-source Hi-C scaffolder that does not require an a priori estimate of chromosome number and minimizes errors by scaffolding with the assistance of an assembly graph. We demonstrate higher accuracy than the state-of-the-art methods across a variety of Hi-C library preparations and input assembly sizes. The Python and C++ code for our method is openly available at https://github.com/machinegun/SALSA.