Discovery of Single Nucleotide Polymorphisms in Complex Genomes Using SGSautoSNP

Single nucleotide polymorphisms (SNPs) are becoming the dominant form of molecular marker for genetic and genomic analysis. The advances in second generation DNA sequencing provide opportunities to identify very large numbers of SNPs in a range of species. However, SNP identification remains a chall...

Bibliographic Details
Main authors: Lorenc, Michał T.; Hayashi, Satomi; Stiller, Jiri; Lee, Hong; Manoli, Sahana; Ruperao, Pradeep; Visendi, Paul; Berkman, Paul J.; Lai, Kaitao; Batley, Jacqueline; Edwards, David
Format: Online Article Text
Language: English
Published: MDPI 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4009776/
https://www.ncbi.nlm.nih.gov/pubmed/24832230
http://dx.doi.org/10.3390/biology1020370
_version_ 1782479802661339136
author Lorenc, Michał T.
Hayashi, Satomi
Stiller, Jiri
Lee, Hong
Manoli, Sahana
Ruperao, Pradeep
Visendi, Paul
Berkman, Paul J.
Lai, Kaitao
Batley, Jacqueline
Edwards, David
collection PubMed
description Single nucleotide polymorphisms (SNPs) are becoming the dominant form of molecular marker for genetic and genomic analysis. Advances in second-generation DNA sequencing provide opportunities to identify very large numbers of SNPs in a range of species. However, SNP identification remains a challenge for large and polyploid genomes due to their size and complexity. We have developed a pipeline, SGSautoSNP (Second-Generation Sequencing AutoSNP), for the robust identification of SNPs in large and complex genomes using Illumina second-generation DNA sequence data, and demonstrated it by discovering more than 800,000 SNPs between four hexaploid wheat cultivars across chromosomes 7A, 7B and 7D. All SNPs are presented for download and viewing within a public GBrowse database. Validation suggests that more than 93% of the SNPs represent polymorphisms between wheat cultivars and hence are valuable for detailed diversity analysis, marker-assisted selection and genotyping by sequencing. The pipeline produces output in GFF3, VCF, Flapjack or Illumina Infinium design format for further genotyping of diverse populations. As well as providing an unprecedented resource for wheat diversity analysis, the method establishes a foundation for high-resolution SNP discovery in other large and complex genomes.
format Online
Article
Text
id pubmed-4009776
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4009776 2014-05-07 Discovery of Single Nucleotide Polymorphisms in Complex Genomes Using SGSautoSNP. Biology (Basel). Article. MDPI 2012-08-27 /pmc/articles/PMC4009776/ /pubmed/24832230 http://dx.doi.org/10.3390/biology1020370 Text en © 2012 by the authors; licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title Discovery of Single Nucleotide Polymorphisms in Complex Genomes Using SGSautoSNP
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4009776/
https://www.ncbi.nlm.nih.gov/pubmed/24832230
http://dx.doi.org/10.3390/biology1020370