
Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum

BACKGROUND: Eight diverse sorghum (Sorghum bicolor L. Moench) accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs). Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local ha...


Bibliographic Details
Main authors: Nelson, James C, Wang, Shichen, Wu, Yuye, Li, Xianran, Antony, Ginny, White, Frank F, Yu, Jianming
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3146956/
https://www.ncbi.nlm.nih.gov/pubmed/21736744
http://dx.doi.org/10.1186/1471-2164-12-352
_version_ 1782209273183338496
author Nelson, James C
Wang, Shichen
Wu, Yuye
Li, Xianran
Antony, Ginny
White, Frank F
Yu, Jianming
author_sort Nelson, James C
collection PubMed
description BACKGROUND: Eight diverse sorghum (Sorghum bicolor L. Moench) accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs). Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local haplotype comparison. The effects of library type and genomic diversity on SNP discovery and imputation are evaluated. RESULTS: Alignment of eight genome equivalents (6 Gb) to the public reference genome revealed 283,000 SNPs at ≥82% confirmation probability. Sequencing from libraries constructed to limit sequencing to start at defined restriction sites led to genotyping 10-fold more SNPs in all 8 accessions, and correctly imputing 11% more missing data, than from semirandom libraries. The SNP yield advantage of the reduced-representation method was less than expected, since up to one fifth of reads started at noncanonical restriction sites and up to one third of restriction sites predicted in silico to yield unique alignments were not sampled at near-saturation. For imputation accuracy, the availability of a genomically similar accession in the germplasm panel was more important than panel size or sequencing coverage. CONCLUSIONS: A sequence quantity of 3 million 50-base reads per accession using a BsrFI library would conservatively provide satisfactory genotyping of 96,000 sorghum SNPs. For the most reliable SNP-genotype imputation in shallowly sequenced genomes, germplasm panels should consist of pairs or groups of genomically similar entries. These results may help in designing strategies for economical genotyping-by-sequencing of large numbers of plant accessions.
format Online
Article
Text
id pubmed-3146956
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3146956 2011-07-31 Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum Nelson, James C Wang, Shichen Wu, Yuye Li, Xianran Antony, Ginny White, Frank F Yu, Jianming BMC Genomics Research Article
BioMed Central 2011-07-07 /pmc/articles/PMC3146956/ /pubmed/21736744 http://dx.doi.org/10.1186/1471-2164-12-352 Text en Copyright ©2011 Nelson et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum
topic Research Article