The pan‐genome of the cultivated soybean (PanSoy) reveals an extraordinarily conserved gene content

Bibliographic Details
Main authors: Torkamaneh, Davoud, Lemay, Marc‐André, Belzile, François
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8428833/
https://www.ncbi.nlm.nih.gov/pubmed/33942475
http://dx.doi.org/10.1111/pbi.13600
author Torkamaneh, Davoud
Lemay, Marc‐André
Belzile, François
collection PubMed
description Studies on structural variation in plants have revealed the inadequacy of a single reference genome for an entire species and suggest that it is necessary to build a species‐representative genome called a pan‐genome to better capture the extent of both structural and nucleotide variation. Here, we present a pan‐genome of cultivated soybean (Glycine max), termed PanSoy, constructed using the de novo genome assembly of 204 phylogenetically and geographically representative improved accessions selected from the larger GmHapMap collection. PanSoy uncovers 108 Mb (~11%) of novel nonreference sequences encompassing 3621 protein‐coding genes (including 1659 novel genes) absent from the soybean ‘Williams 82’ reference genome. Nonetheless, the core genome represents an exceptionally large proportion of the genome, with >90.6% of genes being shared by >99% of the accessions. A majority of PAVs encompassing genes could be confirmed with long‐read sequencing on a subset of accessions. The PanSoy is a major step towards capturing the extent of genetic variation in cultivated soybean and provides a resource for soybean genomics research and breeding.
format Online
Article
Text
id pubmed-8428833
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8428833 2021-09-14 The pan‐genome of the cultivated soybean (PanSoy) reveals an extraordinarily conserved gene content Torkamaneh, Davoud Lemay, Marc‐André Belzile, François Plant Biotechnol J Research Articles Studies on structural variation in plants have revealed the inadequacy of a single reference genome for an entire species and suggest that it is necessary to build a species‐representative genome called a pan‐genome to better capture the extent of both structural and nucleotide variation. Here, we present a pan‐genome of cultivated soybean (Glycine max), termed PanSoy, constructed using the de novo genome assembly of 204 phylogenetically and geographically representative improved accessions selected from the larger GmHapMap collection. PanSoy uncovers 108 Mb (~11%) of novel nonreference sequences encompassing 3621 protein‐coding genes (including 1659 novel genes) absent from the soybean ‘Williams 82’ reference genome. Nonetheless, the core genome represents an exceptionally large proportion of the genome, with >90.6% of genes being shared by >99% of the accessions. A majority of PAVs encompassing genes could be confirmed with long‐read sequencing on a subset of accessions. The PanSoy is a major step towards capturing the extent of genetic variation in cultivated soybean and provides a resource for soybean genomics research and breeding. John Wiley and Sons Inc. 2021-06-15 2021-09 /pmc/articles/PMC8428833/ /pubmed/33942475 http://dx.doi.org/10.1111/pbi.13600 Text en © 2021 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title The pan‐genome of the cultivated soybean (PanSoy) reveals an extraordinarily conserved gene content
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8428833/
https://www.ncbi.nlm.nih.gov/pubmed/33942475
http://dx.doi.org/10.1111/pbi.13600