De novo chromosome level assembly of a plant genome from long read sequence data

Recent advances in the sequencing and assembly of plant genomes have allowed the generation of genomes with increasing contiguity and sequence accuracy. Chromosome level genome assemblies using sequence contigs generated from long read sequencing have involved the use of proximity analysis (Hi‐C) or...

Bibliographic Details
Main authors: Sharma, Priyanka; Masouleh, Ardashir Kharabian; Topp, Bruce; Furtado, Agnelo; Henry, Robert J.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9300133/
https://www.ncbi.nlm.nih.gov/pubmed/34784084
http://dx.doi.org/10.1111/tpj.15583
author Sharma, Priyanka
Masouleh, Ardashir Kharabian
Topp, Bruce
Furtado, Agnelo
Henry, Robert J.
collection PubMed
description Recent advances in the sequencing and assembly of plant genomes have allowed the generation of genomes with increasing contiguity and sequence accuracy. Chromosome level genome assemblies using sequence contigs generated from long read sequencing have involved the use of proximity analysis (Hi‐C) or traditional genetic maps to guide the placement of sequence contigs within chromosomes. The development of highly accurate long reads by repeated sequencing of circularized DNA (HiFi; PacBio) has greatly increased the size of contigs. We now report the use of HiFiasm to assemble the genome of Macadamia jansenii, a genome that has been used as a model to test sequencing and assembly. This achieved almost complete chromosome level assembly from the sequence data alone without the need for higher level chromosome map information. Eight of the 14 chromosomes were represented by a single large contig (six with telomere repeats at both ends) and the other six assembled from two to four main contigs. The small number of chromosome breaks appears to be the result of highly repetitive regions including ribosomal genes that cannot be assembled by these approaches. De novo assembly of near complete chromosome level plant genomes now appears possible using these sequencing and assembly tools. Further targeted strategies might allow these remaining gaps to be closed.
format Online
Article
Text
id pubmed-9300133
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9300133 2022-07-21 De novo chromosome level assembly of a plant genome from long read sequence data Sharma, Priyanka Masouleh, Ardashir Kharabian Topp, Bruce Furtado, Agnelo Henry, Robert J. Plant J Technical Advance John Wiley and Sons Inc. 2021-12-02 2022-02 /pmc/articles/PMC9300133/ /pubmed/34784084 http://dx.doi.org/10.1111/tpj.15583 Text en © 2021 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
This is an open access article under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title De novo chromosome level assembly of a plant genome from long read sequence data
topic Technical Advance
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9300133/
https://www.ncbi.nlm.nih.gov/pubmed/34784084
http://dx.doi.org/10.1111/tpj.15583