De novo chromosome level assembly of a plant genome from long read sequence data
Recent advances in the sequencing and assembly of plant genomes have allowed the generation of genomes with increasing contiguity and sequence accuracy. Chromosome level genome assemblies using sequence contigs generated from long read sequencing have involved the use of proximity analysis (Hi‐C) or...
Main authors: | Sharma, Priyanka; Masouleh, Ardashir Kharabian; Topp, Bruce; Furtado, Agnelo; Henry, Robert J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2021 |
Subjects: | Technical Advance |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9300133/ https://www.ncbi.nlm.nih.gov/pubmed/34784084 http://dx.doi.org/10.1111/tpj.15583 |
_version_ | 1784751141037277184 |
---|---|
author | Sharma, Priyanka; Masouleh, Ardashir Kharabian; Topp, Bruce; Furtado, Agnelo; Henry, Robert J. |
author_facet | Sharma, Priyanka; Masouleh, Ardashir Kharabian; Topp, Bruce; Furtado, Agnelo; Henry, Robert J. |
author_sort | Sharma, Priyanka |
collection | PubMed |
description | Recent advances in the sequencing and assembly of plant genomes have allowed the generation of genomes with increasing contiguity and sequence accuracy. Chromosome level genome assemblies using sequence contigs generated from long read sequencing have involved the use of proximity analysis (Hi‐C) or traditional genetic maps to guide the placement of sequence contigs within chromosomes. The development of highly accurate long reads by repeated sequencing of circularized DNA (HiFi; PacBio) has greatly increased the size of contigs. We now report the use of HiFiasm to assemble the genome of Macadamia jansenii, a genome that has been used as a model to test sequencing and assembly. This achieved almost complete chromosome level assembly from the sequence data alone without the need for higher level chromosome map information. Eight of the 14 chromosomes were represented by a single large contig (six with telomere repeats at both ends) and the other six assembled from two to four main contigs. The small number of chromosome breaks appears to be the result of highly repetitive regions including ribosomal genes that cannot be assembled by these approaches. De novo assembly of near complete chromosome level plant genomes now appears possible using these sequencing and assembly tools. Further targeted strategies might allow these remaining gaps to be closed. |
format | Online Article Text |
id | pubmed-9300133 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9300133 2022-07-21 De novo chromosome level assembly of a plant genome from long read sequence data Sharma, Priyanka Masouleh, Ardashir Kharabian Topp, Bruce Furtado, Agnelo Henry, Robert J. Plant J Technical Advance Recent advances in the sequencing and assembly of plant genomes have allowed the generation of genomes with increasing contiguity and sequence accuracy. Chromosome level genome assemblies using sequence contigs generated from long read sequencing have involved the use of proximity analysis (Hi‐C) or traditional genetic maps to guide the placement of sequence contigs within chromosomes. The development of highly accurate long reads by repeated sequencing of circularized DNA (HiFi; PacBio) has greatly increased the size of contigs. We now report the use of HiFiasm to assemble the genome of Macadamia jansenii, a genome that has been used as a model to test sequencing and assembly. This achieved almost complete chromosome level assembly from the sequence data alone without the need for higher level chromosome map information. Eight of the 14 chromosomes were represented by a single large contig (six with telomere repeats at both ends) and the other six assembled from two to four main contigs. The small number of chromosome breaks appears to be the result of highly repetitive regions including ribosomal genes that cannot be assembled by these approaches. De novo assembly of near complete chromosome level plant genomes now appears possible using these sequencing and assembly tools. Further targeted strategies might allow these remaining gaps to be closed. John Wiley and Sons Inc. 2021-12-02 2022-02 /pmc/articles/PMC9300133/ /pubmed/34784084 http://dx.doi.org/10.1111/tpj.15583 Text en © 2021 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Advance Sharma, Priyanka Masouleh, Ardashir Kharabian Topp, Bruce Furtado, Agnelo Henry, Robert J. De novo chromosome level assembly of a plant genome from long read sequence data |
title | De novo chromosome level assembly of a plant genome from long read sequence data |
title_full | De novo chromosome level assembly of a plant genome from long read sequence data |
title_fullStr | De novo chromosome level assembly of a plant genome from long read sequence data |
title_full_unstemmed | De novo chromosome level assembly of a plant genome from long read sequence data |
title_short | De novo chromosome level assembly of a plant genome from long read sequence data |
title_sort | de novo chromosome level assembly of a plant genome from long read sequence data |
topic | Technical Advance |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9300133/ https://www.ncbi.nlm.nih.gov/pubmed/34784084 http://dx.doi.org/10.1111/tpj.15583 |
work_keys_str_mv | AT sharmapriyanka denovochromosomelevelassemblyofaplantgenomefromlongreadsequencedata AT masoulehardashirkharabian denovochromosomelevelassemblyofaplantgenomefromlongreadsequencedata AT toppbruce denovochromosomelevelassemblyofaplantgenomefromlongreadsequencedata AT furtadoagnelo denovochromosomelevelassemblyofaplantgenomefromlongreadsequencedata AT henryrobertj denovochromosomelevelassemblyofaplantgenomefromlongreadsequencedata |
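
The description above reports that six single-contig chromosomes carried telomere repeats at both ends. As an illustration only, a check of that kind might be sketched as follows. This is not the authors' method: the motif `TTTAGGG` (the Arabidopsis-type repeat found in many plants), the 100-base window, the three-copy threshold, and the function names are all assumptions made for this example, and the record does not say which motif or criteria were used for Macadamia jansenii.

```python
import re

# Assumption: Arabidopsis-type plant telomere repeat; not stated in the record.
TELOMERE_MOTIF = "TTTAGGG"


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_telomere(end_seq, motif=TELOMERE_MOTIF, min_copies=3):
    """True if end_seq holds at least min_copies tandem copies of the motif
    on either strand (the threshold is a hypothetical choice)."""
    end_seq = end_seq.upper()
    for m in (motif, revcomp(motif)):
        if re.search("(?:%s){%d,}" % (m, min_copies), end_seq):
            return True
    return False


def telomere_status(contig, window=100, motif=TELOMERE_MOTIF, min_copies=3):
    """Return (left_end, right_end) booleans for one contig sequence,
    looking only at the first and last `window` bases."""
    left = has_telomere(contig[:window], motif, min_copies)
    right = has_telomere(contig[-window:], motif, min_copies)
    return left, right


if __name__ == "__main__":
    # Toy contig: telomere-like repeats on both ends around a filler body.
    toy = "CCCTAAA" * 5 + "ACGT" * 50 + "TTTAGGG" * 5
    print(telomere_status(toy))
```

A contig for which both values are true would be a candidate telomere-to-telomere chromosome under these assumed criteria; real assemblies would need the sequence read from a FASTA file and thresholds tuned to the species.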