
Integrating Hi-C links with assembly graphs for chromosome-scale assembly

Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, th...

Bibliographic Details
Main authors: Ghurye, Jay, Rhie, Arang, Walenz, Brian P., Schmitt, Anthony, Selvaraj, Siddarth, Pop, Mihai, Phillippy, Adam M., Koren, Sergey
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6719893/
https://www.ncbi.nlm.nih.gov/pubmed/31433799
http://dx.doi.org/10.1371/journal.pcbi.1007273
author Ghurye, Jay
Rhie, Arang
Walenz, Brian P.
Schmitt, Anthony
Selvaraj, Siddarth
Pop, Mihai
Phillippy, Adam M.
Koren, Sergey
collection PubMed
description Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, there are limited open-source tools available. Errors, particularly inversions and fusions across chromosomes, remain higher than alternate scaffolding technologies. We present a novel open-source Hi-C scaffolder that does not require an a priori estimate of chromosome number and minimizes errors by scaffolding with the assistance of an assembly graph. We demonstrate higher accuracy than the state-of-the-art methods across a variety of Hi-C library preparations and input assembly sizes. The Python and C++ code for our method is openly available at https://github.com/machinegun/SALSA.
format Online
Article
Text
id pubmed-6719893
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6719893 2019-09-10
Integrating Hi-C links with assembly graphs for chromosome-scale assembly
Ghurye, Jay
Rhie, Arang
Walenz, Brian P.
Schmitt, Anthony
Selvaraj, Siddarth
Pop, Mihai
Phillippy, Adam M.
Koren, Sergey
PLoS Comput Biol
Research Article
Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, there are limited open-source tools available. Errors, particularly inversions and fusions across chromosomes, remain higher than alternate scaffolding technologies. We present a novel open-source Hi-C scaffolder that does not require an a priori estimate of chromosome number and minimizes errors by scaffolding with the assistance of an assembly graph. We demonstrate higher accuracy than the state-of-the-art methods across a variety of Hi-C library preparations and input assembly sizes. The Python and C++ code for our method is openly available at https://github.com/machinegun/SALSA.
Public Library of Science 2019-08-21
/pmc/articles/PMC6719893/
/pubmed/31433799
http://dx.doi.org/10.1371/journal.pcbi.1007273
Text en
https://creativecommons.org/publicdomain/zero/1.0/
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title Integrating Hi-C links with assembly graphs for chromosome-scale assembly
topic Research Article