
Amplification Biases and Consistent Recovery of Loci in a Double-Digest RAD-seq Protocol

Bibliographic Details
Main authors: DaCosta, Jeffrey M., Sorenson, Michael D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4154734/
https://www.ncbi.nlm.nih.gov/pubmed/25188270
http://dx.doi.org/10.1371/journal.pone.0106713
_version_ 1782333465618808832
author DaCosta, Jeffrey M.
Sorenson, Michael D.
collection PubMed
description A growing variety of “genotype-by-sequencing” (GBS) methods use restriction enzymes and high-throughput DNA sequencing to generate data for a subset of genomic loci, allowing the simultaneous discovery and genotyping of thousands of polymorphisms in a set of multiplexed samples. We evaluated a “double-digest” restriction-site associated DNA sequencing (ddRAD-seq) protocol by 1) comparing results for a zebra finch (Taeniopygia guttata) sample with in silico predictions from the zebra finch reference genome; 2) assessing data quality for a population sample of indigobirds (Vidua spp.); and 3) testing for consistent recovery of loci across multiple samples and sequencing runs. Comparison with in silico predictions revealed that 1) over 90% of predicted, single-copy loci in our targeted size range (178–328 bp) were recovered; 2) short restriction fragments (38–178 bp) were carried through the size selection step and sequenced at appreciable depth, generating unexpected but nonetheless useful data; 3) amplification bias favored shorter, GC-rich fragments, contributing to among-locus variation in sequencing depth that was strongly correlated across samples; 4) our use of restriction enzymes with a GC-rich recognition sequence resulted in an up to four-fold overrepresentation of GC-rich portions of the genome; and 5) star activity (i.e., non-specific cutting) resulted in thousands of “extra” loci sequenced at low depth. Results for three species of indigobirds show that a common set of thousands of loci can be consistently recovered across both individual samples and sequencing runs. In a run with 46 samples, we genotyped 5,996 loci in all individuals and 9,833 loci in 42 or more individuals, resulting in <1% missing data for the larger data set. We compare our approach to similar methods and discuss the range of factors (fragment library preparation, natural genetic variation, bioinformatics) influencing the recovery of a consistent set of loci among samples.
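The in silico prediction step mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The enzyme pair (SbfI and EcoRI), the `ENZYMES` table, and all function names are assumptions introduced here for illustration; the record itself does not name the enzymes. Only the 178 to 328 bp target window comes from the description. The sketch finds cut positions for two palindromic recognition sites, keeps fragments flanked by one cut from each enzyme, filters them by size, and reports GC content.

```python
import re

# Hypothetical enzyme table (illustrative assumption, not taken from the record):
# recognition site and cut offset on the top strand.
# SbfI cuts CCTGCA^GG, EcoRI cuts G^AATTC; both sites are palindromic,
# so scanning one strand finds every site.
ENZYMES = {"SbfI": ("CCTGCAGG", 6), "EcoRI": ("GAATTC", 1)}


def cut_positions(seq, site, offset):
    """Return top-strand cut coordinates for every occurrence of `site`."""
    # A lookahead lets finditer report overlapping occurrences as well.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq, enzyme_a, enzyme_b):
    """Return fragments bounded by one cut from each of the two enzymes."""
    seq = seq.upper()
    cuts = []
    for name in (enzyme_a, enzyme_b):
        site, offset = ENZYMES[name]
        cuts += [(pos, name) for pos in cut_positions(seq, site, offset)]
    cuts.sort()
    fragments = []
    # Walk adjacent cut pairs; keep only fragments with two different ends.
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right:
            fragments.append(seq[start:end])
    return fragments


def select_by_size(fragments, min_bp=178, max_bp=328):
    """Keep fragments whose length falls inside the size-selection window."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]


def gc_fraction(fragment):
    """Fraction of G and C bases in a fragment."""
    return (fragment.count("G") + fragment.count("C")) / len(fragment)


if __name__ == "__main__":
    # Toy sequence: one SbfI site followed by two EcoRI sites.
    toy = "AAAA" + "CCTGCAGG" + "T" * 200 + "GAATTC" + "A" * 10 + "GAATTC" + "CC"
    for frag in select_by_size(double_digest(toy, "SbfI", "EcoRI")):
        print(len(frag), round(gc_fraction(frag), 3))
```

A real analysis would run the same logic over each chromosome of a reference genome and would also have to account for adapter length, multi-copy loci, and star activity, none of which this sketch models.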
format Online
Article
Text
id pubmed-4154734
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4154734 2014-09-08 Amplification Biases and Consistent Recovery of Loci in a Double-Digest RAD-seq Protocol DaCosta, Jeffrey M. Sorenson, Michael D. PLoS One Research Article Public Library of Science 2014-09-04 /pmc/articles/PMC4154734/ /pubmed/25188270 http://dx.doi.org/10.1371/journal.pone.0106713 Text en © 2014 DaCosta, Sorenson http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Amplification Biases and Consistent Recovery of Loci in a Double-Digest RAD-seq Protocol
topic Research Article