High, clustered, nucleotide diversity in the genome of Anopheles gambiae revealed through pooled-template sequencing: implications for high-throughput genotyping protocols

BACKGROUND: Association mapping approaches are dependent upon discovery and validation of single nucleotide polymorphisms (SNPs). To further association studies in Anopheles gambiae we conducted a major resequencing programme, primarily targeting regions within or close to candidate genes for insect...

Bibliographic Details
Main authors: Wilding, Craig S; Weetman, David; Steen, Keith; Donnelly, Martin J
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723138/
https://www.ncbi.nlm.nih.gov/pubmed/19607710
http://dx.doi.org/10.1186/1471-2164-10-320
author Wilding, Craig S
Weetman, David
Steen, Keith
Donnelly, Martin J
collection PubMed
description BACKGROUND: Association mapping approaches are dependent upon discovery and validation of single nucleotide polymorphisms (SNPs). To further association studies in Anopheles gambiae we conducted a major resequencing programme, primarily targeting regions within or close to candidate genes for insecticide resistance. RESULTS: Using two pools of mosquito template DNA we sequenced over 300 kbp across 660 distinct amplicons of the An. gambiae genome. Comparison of SNPs identified from pooled templates with those from individual sequences revealed a very low false positive rate. False negative rates were much higher and mostly resulted from SNPs with a low minor allele frequency. Pooled-template sequencing also provided good estimates of SNP allele frequencies. Allele frequency estimation success, along with false positive and negative call rates, improved significantly when using a qualitative measure of SNP call quality. We identified a total of 7062 polymorphic features comprising 6995 SNPs and 67 indels, with, on average, a SNP every 34 bp; a high rate of polymorphism that is comparable to other studies of mosquitoes. SNPs were significantly more frequent in members of the cytochrome P450 mono-oxygenase and carboxy/cholinesterase gene families than in glutathione-S-transferases, other detoxification genes, and control genomic regions. Polymorphic sites showed a significantly clustered distribution, but the degree of SNP clustering (independent of SNP frequency) did not vary among gene families, suggesting that clustering of polymorphisms is a general property of the An. gambiae genome. CONCLUSION: The high frequency and clustering of SNPs have important ramifications for the design of high-throughput genotyping assays based on allele-specific primer extension or probe hybridisation. We illustrate these issues in the context of the design of Illumina GoldenGate assays.
format Text
id pubmed-2723138
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2723138 2009-08-08 BMC Genomics Research Article BioMed Central 2009-07-16 /pmc/articles/PMC2723138/ /pubmed/19607710 http://dx.doi.org/10.1186/1471-2164-10-320 Text en Copyright © 2009 Wilding et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title High, clustered, nucleotide diversity in the genome of Anopheles gambiae revealed through pooled-template sequencing: implications for high-throughput genotyping protocols
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723138/
https://www.ncbi.nlm.nih.gov/pubmed/19607710
http://dx.doi.org/10.1186/1471-2164-10-320