Improving Illumina assemblies with Hi‐C and long reads: An example with the North African dromedary

Researchers have assembled thousands of eukaryotic genomes using Illumina reads, but traditional mate‐pair libraries cannot span all repetitive elements, resulting in highly fragmented assemblies. However, both chromosome conformation capture techniques, such as Hi‐C and Dovetail Genomics Chicago li...

Bibliographic Details
Main authors: Elbers, Jean P., Rogers, Mark F., Perelman, Polina L., Proskuryakova, Anastasia A., Serdyukova, Natalia A., Johnson, Warren E., Horin, Petr, Corander, Jukka, Murphy, David, Burger, Pamela A.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6618069/
https://www.ncbi.nlm.nih.gov/pubmed/30972949
http://dx.doi.org/10.1111/1755-0998.13020
author Elbers, Jean P.
Rogers, Mark F.
Perelman, Polina L.
Proskuryakova, Anastasia A.
Serdyukova, Natalia A.
Johnson, Warren E.
Horin, Petr
Corander, Jukka
Murphy, David
Burger, Pamela A.
collection PubMed
description Researchers have assembled thousands of eukaryotic genomes using Illumina reads, but traditional mate‐pair libraries cannot span all repetitive elements, resulting in highly fragmented assemblies. However, both chromosome conformation capture techniques, such as Hi‐C and Dovetail Genomics Chicago libraries, and long‐read sequencing, such as Pacific Biosciences and Oxford Nanopore, help span and resolve repetitive regions and therefore improve genome assemblies. One important livestock species of arid regions that does not have a high‐quality contiguous reference genome is the dromedary (Camelus dromedarius). Draft genomes exist but are highly fragmented, and a high‐quality reference genome is needed to understand adaptation to desert environments and artificial selection during domestication. Dromedaries are among the last livestock species to have been domesticated, and together with wild and domestic Bactrian camels, they are the only representatives of the Camelini tribe, which highlights their evolutionary significance. Here we describe our efforts to improve the North African dromedary genome. We used Chicago and Hi‐C sequencing libraries from Dovetail Genomics to resolve the order of previously assembled contigs, producing almost chromosome‐level scaffolds. Remaining gaps were filled with Pacific Biosciences long reads, and then scaffolds were comparatively mapped to chromosomes. Long reads added 99.32 Mbp to the total length of the new assembly. Dovetail Chicago and Hi‐C libraries increased the longest scaffold over 12‐fold, from 9.71 Mbp to 124.99 Mbp, and the scaffold N50 over 50‐fold, from 1.48 Mbp to 75.02 Mbp. We demonstrate that Illumina de novo assemblies can be substantially upgraded by combining chromosome conformation capture and long‐read sequencing.
format Online
Article
Text
id pubmed-6618069
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6618069 2019-07-22 Mol Ecol Resour RESOURCE ARTICLES John Wiley and Sons Inc. 2019-05-17 2019-07 /pmc/articles/PMC6618069/ /pubmed/30972949 http://dx.doi.org/10.1111/1755-0998.13020 Text en © 2019 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Improving Illumina assemblies with Hi‐C and long reads: An example with the North African dromedary
topic RESOURCE ARTICLES
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6618069/
https://www.ncbi.nlm.nih.gov/pubmed/30972949
http://dx.doi.org/10.1111/1755-0998.13020