JASPER: A fast genome polishing tool that improves accuracy of genome assemblies
Main authors: | Guo, Alina; Salzberg, Steven L.; Zimin, Aleksey V. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2023 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10096238/ https://www.ncbi.nlm.nih.gov/pubmed/37000853 http://dx.doi.org/10.1371/journal.pcbi.1011032 |
_version_ | 1785024285823205376 |
---|---|
author | Guo, Alina; Salzberg, Steven L.; Zimin, Aleksey V. |
author_facet | Guo, Alina; Salzberg, Steven L.; Zimin, Aleksey V. |
author_sort | Guo, Alina |
collection | PubMed |
description | Advances in long-read sequencing technologies have dramatically improved the contiguity and completeness of genome assemblies. Using the latest nanopore-based sequencers, we can generate enough data for the assembly of a human genome from a single flow cell. With the long-read data from these sequences, we can now routinely produce de novo genome assemblies in which half or more of a genome is contained in megabase-scale contigs. Assemblies produced from nanopore data alone, though, have relatively high error rates and can benefit from a process called polishing, in which more-accurate reads are used to correct errors in the consensus sequence. In this manuscript, we present a novel tool for genome polishing called JASPER (Jellyfish-based Assembly Sequence Polisher for Error Reduction). In contrast to many other polishing methods, JASPER gains efficiency by avoiding the alignment of reads to the assembly. Instead, JASPER uses a database of k-mer counts that it creates from the reads to detect and correct errors in the consensus. Our experiments demonstrate that JASPER is faster than alignment-based polishers, and both faster and more accurate than other k-mer based polishing methods. We also introduce the idea of using a polishing tool to create population-specific reference genomes, and illustrate this idea using sequence data from multiple individuals from Tokyo, Japan. |
format | Online Article Text |
id | pubmed-10096238 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10096238 2023-04-13 JASPER: A fast genome polishing tool that improves accuracy of genome assemblies. Guo, Alina; Salzberg, Steven L.; Zimin, Aleksey V. PLoS Comput Biol. Research Article. Public Library of Science 2023-03-31 /pmc/articles/PMC10096238/ /pubmed/37000853 http://dx.doi.org/10.1371/journal.pcbi.1011032 Text en © 2023 Guo et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article. Guo, Alina; Salzberg, Steven L.; Zimin, Aleksey V. JASPER: A fast genome polishing tool that improves accuracy of genome assemblies |
title | JASPER: A fast genome polishing tool that improves accuracy of genome assemblies |
title_sort | jasper: a fast genome polishing tool that improves accuracy of genome assemblies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10096238/ https://www.ncbi.nlm.nih.gov/pubmed/37000853 http://dx.doi.org/10.1371/journal.pcbi.1011032 |
work_keys_str_mv | AT guoalina jasperafastgenomepolishingtoolthatimprovesaccuracyofgenomeassemblies AT salzbergstevenl jasperafastgenomepolishingtoolthatimprovesaccuracyofgenomeassemblies AT ziminalekseyv jasperafastgenomepolishingtoolthatimprovesaccuracyofgenomeassemblies |
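
The abstract in this record describes detecting and correcting consensus errors from a database of k-mer counts rather than from read alignments. The general idea can be illustrated with a minimal Python sketch. This is not JASPER's implementation: the function names, the `min_count` threshold, the in-memory `Counter` (standing in for a Jellyfish database), and the substitution-only correction are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer in the reads (forward strand only, for simplicity)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def flag_suspect_positions(consensus, counts, k, min_count=2):
    """Return start positions of consensus k-mers seen fewer than
    min_count times in the reads (hypothetical threshold)."""
    return [
        i
        for i in range(len(consensus) - k + 1)
        if counts[consensus[i:i + k]] < min_count
    ]


def try_single_base_fix(consensus, pos, counts, k, min_count=2):
    """Try each alternative base at pos and return the first candidate for
    which every k-mer overlapping pos is well supported; otherwise return
    the consensus unchanged. Only substitutions are handled here."""
    lo = max(0, pos - k + 1)
    hi = min(len(consensus) - k, pos)
    for base in "ACGT":
        if base == consensus[pos]:
            continue
        candidate = consensus[:pos] + base + consensus[pos + 1:]
        if all(counts[candidate[i:i + k]] >= min_count for i in range(lo, hi + 1)):
            return candidate
    return consensus
```

In this toy setting, a single wrong base makes every k-mer that overlaps it rare, so a run of up to k consecutive flagged start positions points to one error at the last flagged start. A real polisher must also handle insertions, deletions, both strands, and coverage-dependent thresholds, none of which this sketch attempts.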