Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly
BACKGROUND: Many of the world's most important food crops have either polyploid genomes or homeologous regions derived from segmental shuffling following polyploid formation. The soybean (Glycine max) genome has been shown to be composed of approximately four thousand short interspersed homeolo...
Main authors: | Saini, Navinder; Shultz, Jeffry; Lightfoot, David A |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2478686/ https://www.ncbi.nlm.nih.gov/pubmed/18606011 http://dx.doi.org/10.1186/1471-2164-9-323 |
_version_ | 1782157618291146752 |
---|---|
author | Saini, Navinder Shultz, Jeffry Lightfoot, David A |
author_facet | Saini, Navinder Shultz, Jeffry Lightfoot, David A |
author_sort | Saini, Navinder |
collection | PubMed |
description | BACKGROUND: Many of the world's most important food crops have either polyploid genomes or homeologous regions derived from segmental shuffling following polyploid formation. The soybean (Glycine max) genome has been shown to be composed of approximately four thousand short interspersed homeologous regions with 1, 2 or 4 copies per haploid genome by RFLP analysis, microsatellite anchors to BACs and by contigs formed from BAC fingerprints. Despite these similar regions, the genome has been sequenced by whole genome shotgun sequence (WGS). Here the aim was to use BAC end sequences (BES) derived from three minimum tile paths (MTP) to examine the extent and homogeneity of polyploid-like regions within contigs and the extent of correlation between the polyploid-like regions inferred from fingerprinting and the polyploid-like sequences inferred from WGS matches. RESULTS: Results show that when sequence divergence was 1–10%, the copy number of homeologous regions could be identified from sequence variation in WGS reads overlapping BES. Homeolog sequence variants (HSVs) were single nucleotide polymorphisms (SNPs; 89%) and single nucleotide indels (SNIs; 10%). Larger indels were rare but present (1%). Simulations that had predicted that fingerprints of homeologous regions could be separated when divergence exceeded 2% were shown to be false. We show that a 5–10% sequence divergence is necessary to separate homeologs by fingerprinting. BES compared to WGS traces showed polyploid-like regions with less than 1% sequence divergence exist at 2.3% of the locations assayed. CONCLUSION: The use of HSVs like SNPs and SNIs to characterize BACs will improve contig building methods. The implications for bioinformatic and functional annotation of polyploid and paleopolyploid genomes show that a combined approach of BAC fingerprint based physical maps, WGS sequence and HSV-based partitioning of BAC clones from homeologous regions to separate contigs will allow reliable de-convolution and positioning of sequence scaffolds (see BES_scaffolds section of SoyGD). This approach will assist genome annotation for paleopolyploid and true polyploid genomes such as soybean and many important cereal and fruit crops. |
format | Text |
id | pubmed-2478686 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2478686 2008-07-22 Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly Saini, Navinder Shultz, Jeffry Lightfoot, David A BMC Genomics Research Article BACKGROUND: Many of the world's most important food crops have either polyploid genomes or homeologous regions derived from segmental shuffling following polyploid formation. The soybean (Glycine max) genome has been shown to be composed of approximately four thousand short interspersed homeologous regions with 1, 2 or 4 copies per haploid genome by RFLP analysis, microsatellite anchors to BACs and by contigs formed from BAC fingerprints. Despite these similar regions, the genome has been sequenced by whole genome shotgun sequence (WGS). Here the aim was to use BAC end sequences (BES) derived from three minimum tile paths (MTP) to examine the extent and homogeneity of polyploid-like regions within contigs and the extent of correlation between the polyploid-like regions inferred from fingerprinting and the polyploid-like sequences inferred from WGS matches. RESULTS: Results show that when sequence divergence was 1–10%, the copy number of homeologous regions could be identified from sequence variation in WGS reads overlapping BES. Homeolog sequence variants (HSVs) were single nucleotide polymorphisms (SNPs; 89%) and single nucleotide indels (SNIs; 10%). Larger indels were rare but present (1%). Simulations that had predicted that fingerprints of homeologous regions could be separated when divergence exceeded 2% were shown to be false. We show that a 5–10% sequence divergence is necessary to separate homeologs by fingerprinting. BES compared to WGS traces showed polyploid-like regions with less than 1% sequence divergence exist at 2.3% of the locations assayed. CONCLUSION: The use of HSVs like SNPs and SNIs to characterize BACs will improve contig building methods. The implications for bioinformatic and functional annotation of polyploid and paleopolyploid genomes show that a combined approach of BAC fingerprint based physical maps, WGS sequence and HSV-based partitioning of BAC clones from homeologous regions to separate contigs will allow reliable de-convolution and positioning of sequence scaffolds (see BES_scaffolds section of SoyGD). This approach will assist genome annotation for paleopolyploid and true polyploid genomes such as soybean and many important cereal and fruit crops. BioMed Central 2008-07-07 /pmc/articles/PMC2478686/ /pubmed/18606011 http://dx.doi.org/10.1186/1471-2164-9-323 Text en Copyright © 2008 Saini et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Saini, Navinder Shultz, Jeffry Lightfoot, David A Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly |
title | Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2478686/ https://www.ncbi.nlm.nih.gov/pubmed/18606011 http://dx.doi.org/10.1186/1471-2164-9-323 |
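
The variant classes and divergence ranges reported in the description field can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes two sequences that have already been aligned with `-` as the gap character, and the function names (`classify_hsvs`, `percent_divergence`, `divergence_regime`) and the exact handling of the range boundaries are hypothetical choices made for illustration.

```python
from collections import Counter


def classify_hsvs(ref: str, alt: str) -> Counter:
    """Count SNPs, single nucleotide indels (SNIs) and larger indels
    between two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(ref) != len(alt):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter()
    gap_run = 0  # length of the current run of gapped columns
    for a, b in zip(ref, alt):
        if a == "-" or b == "-":
            gap_run += 1
            continue
        if gap_run:
            # A run of one gapped column is an SNI; longer runs are larger indels.
            counts["SNI" if gap_run == 1 else "larger_indel"] += 1
            gap_run = 0
        if a != b:
            counts["SNP"] += 1
    if gap_run:  # alignment ended inside a gap run
        counts["SNI" if gap_run == 1 else "larger_indel"] += 1
    return counts


def percent_divergence(ref: str, alt: str) -> float:
    """Percentage of alignment columns that differ (mismatch or gap)."""
    if len(ref) != len(alt) or not ref:
        raise ValueError("aligned sequences must be non-empty and equal length")
    differing = sum(1 for a, b in zip(ref, alt) if a != b)
    return 100.0 * differing / len(ref)


def divergence_regime(divergence: float) -> str:
    """Map a divergence percentage onto the ranges named in the abstract.
    Boundary handling (< vs <=) is an assumption, not stated in the source."""
    if divergence < 1.0:
        return "below 1%: under the range where copy number was resolved from WGS reads"
    if divergence < 5.0:
        return "1-5%: resolvable from WGS reads overlapping BES, not by fingerprinting"
    if divergence <= 10.0:
        return "5-10%: resolvable from WGS reads and in the range needed for fingerprinting"
    return "above 10%: outside the range discussed in the abstract"
```

For example, `classify_hsvs("ACGTACGTAC", "ACGAAC-TAC")` counts one SNP and one SNI, and `percent_divergence` on the same pair gives 20.0, which `divergence_regime` places outside the range the abstract discusses.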