Genome assembly of the popular Korean soybean cultivar Hwangkeum
Massive resequencing efforts have been undertaken to catalog allelic variants in major crop species including soybean, but the scope of the information for genetic variation often depends on short sequence reads mapped to the extant reference genome. Additional de novo assembled genome sequences pro...
Main authors: | Kim, Myung-Shin; Lee, Taeyoung; Baek, Jeonghun; Kim, Ji Hong; Kim, Changhoon; Jeong, Soon-Chun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Genome Report |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496230/ https://www.ncbi.nlm.nih.gov/pubmed/34568925 http://dx.doi.org/10.1093/g3journal/jkab272 |
_version_ | 1784579711638175744 |
---|---|
author | Kim, Myung-Shin Lee, Taeyoung Baek, Jeonghun Kim, Ji Hong Kim, Changhoon Jeong, Soon-Chun |
author_facet | Kim, Myung-Shin Lee, Taeyoung Baek, Jeonghun Kim, Ji Hong Kim, Changhoon Jeong, Soon-Chun |
author_sort | Kim, Myung-Shin |
collection | PubMed |
description | Massive resequencing efforts have been undertaken to catalog allelic variants in major crop species including soybean, but the scope of the information for genetic variation often depends on short sequence reads mapped to the extant reference genome. Additional de novo assembled genome sequences provide a unique opportunity to explore a dispensable genome fraction in the pan-genome of a species. Here, we report the de novo assembly and annotation of Hwangkeum, a popular soybean cultivar in Korea. The assembly was constructed using PromethION nanopore sequencing data and two genetic maps and was then error-corrected using Illumina short-reads and PacBio SMRT reads. The 933.12 Mb assembly was annotated as containing 79,870 transcripts for 58,550 genes using RNA-Seq data and the public soybean annotation set. Comparison of the Hwangkeum assembly with the Williams 82 soybean reference genome sequence (Wm82.a2.v1) revealed 1.8 million single-nucleotide polymorphisms, 0.5 million indels, and 25 thousand putative structural variants. However, there was no natural megabase-scale chromosomal rearrangement. Incidentally, by adding two novel subfamilies, we found that soybean contains four clearly separated subfamilies of centromeric satellite repeats. Analyses of satellite repeats and gene content suggested that the Hwangkeum assembly is a high-quality assembly. This was further supported by comparison of the marker arrangement of anthocyanin biosynthesis genes and of gene arrangement at the Rsv3 locus. Therefore, the results indicate that the de novo assembly of Hwangkeum is a valuable additional reference genome resource for characterizing traits for the improvement of this important crop species. |
format | Online Article Text |
id | pubmed-8496230 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8496230 2021-10-07 Genome assembly of the popular Korean soybean cultivar Hwangkeum Kim, Myung-Shin Lee, Taeyoung Baek, Jeonghun Kim, Ji Hong Kim, Changhoon Jeong, Soon-Chun G3 (Bethesda) Genome Report Oxford University Press 2021-07-30 /pmc/articles/PMC8496230/ /pubmed/34568925 http://dx.doi.org/10.1093/g3journal/jkab272 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Genome Report Kim, Myung-Shin Lee, Taeyoung Baek, Jeonghun Kim, Ji Hong Kim, Changhoon Jeong, Soon-Chun Genome assembly of the popular Korean soybean cultivar Hwangkeum |
title | Genome assembly of the popular Korean soybean cultivar Hwangkeum |
title_sort | genome assembly of the popular korean soybean cultivar hwangkeum |
topic | Genome Report |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496230/ https://www.ncbi.nlm.nih.gov/pubmed/34568925 http://dx.doi.org/10.1093/g3journal/jkab272 |