
Gene duplication and paleopolyploidy in soybean and the implications for whole genome sequencing

Bibliographic Details
Main authors: Schlueter, Jessica A; Lin, Jer-Young; Schlueter, Shannon D; Vasylenko-Sanders, Iryna F; Deshpande, Shweta; Yi, Jing; O'Bleness, Majesta; Roe, Bruce A; Nelson, Rex T; Scheffler, Brian E; Jackson, Scott A; Shoemaker, Randy C
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2077340/
https://www.ncbi.nlm.nih.gov/pubmed/17880721
http://dx.doi.org/10.1186/1471-2164-8-330
collection PubMed
description BACKGROUND: Soybean, Glycine max (L.) Merr., is a well-documented paleopolyploid. What remains relatively undercharacterized is the level of sequence identity in retained homeologous regions of the genome. Recently, the Department of Energy Joint Genome Institute and United States Department of Agriculture jointly announced the sequencing of the soybean genome. One of the initial concerns is what effect sequence identity in homeologous regions would have on whole-genome shotgun sequence assembly. RESULTS: Seventeen BACs representing ~2.03 Mb were sequenced as representative potential homeologous regions from the soybean genome. Genetic mapping of each BAC shows that 11 of the 20 chromosomes are represented. Sequence comparisons between homeologous BACs show that the soybean genome is a mosaic of retained paleopolyploid regions. Some regions appear to be highly conserved while other regions have diverged significantly. Large-scale "batch" reassembly of all 17 BACs combined showed that even the most homeologous BACs, with upwards of 95% sequence identity, resolve into their respective homeologous sequences. Potential assembly errors were generated by tandemly duplicated pentatricopeptide repeat-containing genes and long simple sequence repeats. Analysis of a whole-genome shotgun assembly of 80,000 randomly chosen JGI-DOE sequence traces reveals some new soybean-specific repeat sequences. CONCLUSION: This analysis investigated both the structure of the paleopolyploid soybean genome and the potential effects retained homeology will have on assembling the whole-genome shotgun sequence. Based upon these results, homeologous regions similar to those characterized here will not cause major assembly issues.
format Text
id pubmed-2077340
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2077340 2007-11-14 BMC Genomics Research Article BioMed Central 2007-09-19 /pmc/articles/PMC2077340/ /pubmed/17880721 http://dx.doi.org/10.1186/1471-2164-8-330 Text en Copyright © 2007 Schlueter et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Gene duplication and paleopolyploidy in soybean and the implications for whole genome sequencing
topic Research Article