Genome assembly of the popular Korean soybean cultivar Hwangkeum

Massive resequencing efforts have been undertaken to catalog allelic variants in major crop species including soybean, but the scope of the information for genetic variation often depends on short sequence reads mapped to the extant reference genome. Additional de novo assembled genome sequences pro...

Bibliographic Details
Main Authors: Kim, Myung-Shin; Lee, Taeyoung; Baek, Jeonghun; Kim, Ji Hong; Kim, Changhoon; Jeong, Soon-Chun
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496230/
https://www.ncbi.nlm.nih.gov/pubmed/34568925
http://dx.doi.org/10.1093/g3journal/jkab272
_version_ 1784579711638175744
author Kim, Myung-Shin
Lee, Taeyoung
Baek, Jeonghun
Kim, Ji Hong
Kim, Changhoon
Jeong, Soon-Chun
author_sort Kim, Myung-Shin
collection PubMed
description Massive resequencing efforts have been undertaken to catalog allelic variants in major crop species including soybean, but the scope of the information for genetic variation often depends on short sequence reads mapped to the extant reference genome. Additional de novo assembled genome sequences provide a unique opportunity to explore a dispensable genome fraction in the pan-genome of a species. Here, we report the de novo assembly and annotation of Hwangkeum, a popular soybean cultivar in Korea. The assembly was constructed using PromethION nanopore sequencing data and two genetic maps and was then error-corrected using Illumina short-reads and PacBio SMRT reads. The 933.12 Mb assembly was annotated as containing 79,870 transcripts for 58,550 genes using RNA-Seq data and the public soybean annotation set. Comparison of the Hwangkeum assembly with the Williams 82 soybean reference genome sequence (Wm82.a2.v1) revealed 1.8 million single-nucleotide polymorphisms, 0.5 million indels, and 25 thousand putative structural variants. However, there was no natural megabase-scale chromosomal rearrangement. Incidentally, by adding two novel subfamilies, we found that soybean contains four clearly separated subfamilies of centromeric satellite repeats. Analyses of satellite repeats and gene content suggested that the Hwangkeum assembly is a high-quality assembly. This was further supported by comparison of the marker arrangement of anthocyanin biosynthesis genes and of gene arrangement at the Rsv3 locus. Therefore, the results indicate that the de novo assembly of Hwangkeum is a valuable additional reference genome resource for characterizing traits for the improvement of this important crop species.
format Online
Article
Text
id pubmed-8496230
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8496230 2021-10-07 Genome assembly of the popular Korean soybean cultivar Hwangkeum Kim, Myung-Shin Lee, Taeyoung Baek, Jeonghun Kim, Ji Hong Kim, Changhoon Jeong, Soon-Chun G3 (Bethesda) Genome Report
Oxford University Press 2021-07-30 /pmc/articles/PMC8496230/ /pubmed/34568925 http://dx.doi.org/10.1093/g3journal/jkab272 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Genome assembly of the popular Korean soybean cultivar Hwangkeum
topic Genome Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496230/
https://www.ncbi.nlm.nih.gov/pubmed/34568925
http://dx.doi.org/10.1093/g3journal/jkab272