dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms
Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms wi...
Main Authors: | Puritz, Jonathan B., Hollenbeck, Christopher M., Gold, John R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2014 |
Subjects: | Bioinformatics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4060032/ https://www.ncbi.nlm.nih.gov/pubmed/24949246 http://dx.doi.org/10.7717/peerj.431 |
_version_ | 1782321310314004480 |
---|---|
author | Puritz, Jonathan B. Hollenbeck, Christopher M. Gold, John R. |
author_facet | Puritz, Jonathan B. Hollenbeck, Christopher M. Gold, John R. |
author_sort | Puritz, Jonathan B. |
collection | PubMed |
description | Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms with large effective population sizes and high levels of genetic polymorphism. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is due to the fact that dDocent quality trims instead of filtering, incorporates both forward and reverse reads (including reads with INDEL polymorphisms) in assembly, mapping, and SNP calling. The pipeline and a comprehensive user guide can be found at http://dDocent.wordpress.com. |
format | Online Article Text |
id | pubmed-4060032 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-40600322014-06-19 dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms Puritz, Jonathan B. Hollenbeck, Christopher M. Gold, John R. PeerJ Bioinformatics Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms with large effective population sizes and high levels of genetic polymorphism. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is due to the fact that dDocent quality trims instead of filtering, incorporates both forward and reverse reads (including reads with INDEL polymorphisms) in assembly, mapping, and SNP calling. The pipeline and a comprehensive user guide can be found at http://dDocent.wordpress.com. PeerJ Inc. 2014-06-10 /pmc/articles/PMC4060032/ /pubmed/24949246 http://dx.doi.org/10.7717/peerj.431 Text en © 2014 Puritz et al. 
http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Puritz, Jonathan B. Hollenbeck, Christopher M. Gold, John R. dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms |
title | dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms |
title_full | dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms |
title_fullStr | dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms |
title_full_unstemmed | dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms |
title_short | dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms |
title_sort | ddocent: a radseq, variant-calling pipeline designed for population genomics of non-model organisms |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4060032/ https://www.ncbi.nlm.nih.gov/pubmed/24949246 http://dx.doi.org/10.7717/peerj.431 |