Genome wide SNP discovery, analysis and evaluation in mallard (Anas platyrhynchos)

Bibliographic Details
Main Authors: Kraus, Robert HS, Kerstens, Hindrik HD, Van Hooft, Pim, Crooijmans, Richard PMA, Van Der Poel, Jan J, Elmberg, Johan, Vignal, Alain, Huang, Yinhua, Li, Ning, Prins, Herbert HT, Groenen, Martien AM
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3065436/
https://www.ncbi.nlm.nih.gov/pubmed/21410945
http://dx.doi.org/10.1186/1471-2164-12-150
author Kraus, Robert HS
Kerstens, Hindrik HD
Van Hooft, Pim
Crooijmans, Richard PMA
Van Der Poel, Jan J
Elmberg, Johan
Vignal, Alain
Huang, Yinhua
Li, Ning
Prins, Herbert HT
Groenen, Martien AM
collection PubMed
description BACKGROUND: Next-generation sequencing technologies make it possible to obtain, at low cost, the genomic sequence information that is currently lacking for most economically and ecologically important organisms. For the mallard duck, genomic data are limited. Besides being a species of great agricultural and societal importance, the mallard is also the focal species for long-distance dispersal of avian influenza. For large-scale identification of SNPs, we performed Illumina sequencing of wild mallard DNA and compared our data with ongoing genome and EST sequencing of domesticated conspecifics. This is the first study of its kind for waterfowl.
RESULTS: More than one billion base pairs of sequence information were generated, resulting in 16× coverage of a reduced representation library of the mallard genome. Sequence reads were aligned to a draft domesticated duck reference genome, which allowed the detection of over 122,000 SNPs within our mallard sequence dataset. In addition, almost 62,000 nucleotide positions on the domesticated duck reference showed a different nucleotide compared to wild mallard. Approximately 20,000 SNPs identified within our data were shared with SNPs identified in the sequenced domestic duck or in EST sequencing projects. The shared SNPs were considered to be highly reliable and were used to benchmark non-shared SNPs for quality. Genotyping of a representative sample of 364 SNPs resulted in a SNP conversion rate of 99.7%. The correlation between the minor allele count and the observed minor allele frequency in the SNP discovery pool was 0.72.
CONCLUSION: We identified almost 150,000 SNPs in wild mallards that will likely yield good results in genotyping. Of these, ~101,000 SNPs were detected within our wild mallard sequences and ~49,000 were detected between wild and domesticated duck data. Among the ~101,000 SNPs, we found a subset of ~20,000 SNPs shared between wild mallards and the sequenced domesticated duck, suggesting low genetic divergence. Comparison of quality metrics between the total SNP set (122,000 + 62,000 = 184,000 SNPs) and the validated subset shows similar characteristics for both sets. This indicates that we have detected a large number (~150,000) of accurately inferred mallard SNPs, which will benefit bird evolutionary studies, ecological studies (e.g. disentangling migratory connectivity) and industrial breeding programs.
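The SNP counts quoted in the description can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the record: the variable names are hypothetical, the counts are the rounded figures from the abstract, and the reading of the 99.7% conversion rate as 363 successful assays out of 364 is an assumption, since the abstract reports only the percentage.

```python
# Rounded figures as quoted in the abstract (approximate, not exact counts).
within_wild_raw = 122_000        # SNPs detected within the wild mallard reads
wild_vs_domestic_raw = 62_000    # positions differing from the domestic reference
within_wild_kept = 101_000       # subset expected to genotype well
wild_vs_domestic_kept = 49_000   # subset expected to genotype well

total_raw = within_wild_raw + wild_vs_domestic_raw
total_kept = within_wild_kept + wild_vs_domestic_kept
retained_fraction = total_kept / total_raw

# Assumption: a 99.7% conversion rate over 364 assays means one failed assay.
genotyped = 364
assumed_converted = 363
conversion_rate = assumed_converted / genotyped

print(f"total SNP set: {total_raw}")
print(f"SNPs expected to genotype well: {total_kept}")
print(f"retained fraction: {retained_fraction:.3f}")
print(f"conversion rate: {conversion_rate:.3f}")
```

Under these assumptions, roughly four out of five SNPs in the total set fall into the subset expected to genotype well.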
format Text
id pubmed-3065436
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3065436 2011-03-29 BMC Genomics Research Article BioMed Central 2011-03-16 /pmc/articles/PMC3065436/ /pubmed/21410945 http://dx.doi.org/10.1186/1471-2164-12-150 Text en Copyright ©2011 Kraus et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Genome wide SNP discovery, analysis and evaluation in mallard (Anas platyrhynchos)
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3065436/
https://www.ncbi.nlm.nih.gov/pubmed/21410945
http://dx.doi.org/10.1186/1471-2164-12-150