dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms

Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms wi...

Bibliographic Details
Main authors: Puritz, Jonathan B., Hollenbeck, Christopher M., Gold, John R.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2014
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4060032/
https://www.ncbi.nlm.nih.gov/pubmed/24949246
http://dx.doi.org/10.7717/peerj.431
_version_ 1782321310314004480
author Puritz, Jonathan B.
Hollenbeck, Christopher M.
Gold, John R.
author_facet Puritz, Jonathan B.
Hollenbeck, Christopher M.
Gold, John R.
author_sort Puritz, Jonathan B.
collection PubMed
description Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that uses both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms with large effective population sizes and high levels of genetic polymorphism. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is because dDocent quality trims reads instead of filtering them and incorporates both forward and reverse reads (including reads with Indel polymorphisms) in assembly, mapping, and SNP calling. The pipeline and a comprehensive user guide can be found at http://dDocent.wordpress.com.
format Online
Article
Text
id pubmed-4060032
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-40600322014-06-19 dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms Puritz, Jonathan B. Hollenbeck, Christopher M. Gold, John R. PeerJ Bioinformatics Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms with large effective population sizes and high levels of genetic polymorphism. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is due to the fact that dDocent quality trims instead of filtering, incorporates both forward and reverse reads (including reads with INDEL polymorphisms) in assembly, mapping, and SNP calling. The pipeline and a comprehensive user guide can be found at http://dDocent.wordpress.com. PeerJ Inc. 2014-06-10 /pmc/articles/PMC4060032/ /pubmed/24949246 http://dx.doi.org/10.7717/peerj.431 Text en © 2014 Puritz et al. 
http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Puritz, Jonathan B.
Hollenbeck, Christopher M.
Gold, John R.
dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms
title dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms
title_full dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms
title_fullStr dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms
title_full_unstemmed dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms
title_short dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms
title_sort ddocent: a radseq, variant-calling pipeline designed for population genomics of non-model organisms
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4060032/
https://www.ncbi.nlm.nih.gov/pubmed/24949246
http://dx.doi.org/10.7717/peerj.431
work_keys_str_mv AT puritzjonathanb ddocentaradseqvariantcallingpipelinedesignedforpopulationgenomicsofnonmodelorganisms
AT hollenbeckchristopherm ddocentaradseqvariantcallingpipelinedesignedforpopulationgenomicsofnonmodelorganisms
AT goldjohnr ddocentaradseqvariantcallingpipelinedesignedforpopulationgenomicsofnonmodelorganisms