High-quality genetic mapping with ddRADseq in the non-model tree Quercus rubra
BACKGROUND: Restriction site associated DNA sequencing (RADseq) has the potential to be a broadly applicable, low-cost approach for high-quality genetic linkage mapping in forest trees lacking a reference genome. The statistical inference of linear order must be as accurate as possible for the corre...
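Main authors: | Konar, Arpita; Choudhury, Olivia; Bullis, Rebecca; Fiedler, Lauren; Kruser, Jacqueline M.; Stephens, Melissa T.; Gailing, Oliver; Schlarbaum, Scott; Coggeshall, Mark V.; Staton, Margaret E.; Carlson, John E.; Emrich, Scott; Romero-Severson, Jeanne |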
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5450186/ https://www.ncbi.nlm.nih.gov/pubmed/28558688 http://dx.doi.org/10.1186/s12864-017-3765-8 |
_version_ | 1783239916097896448 |
---|---|
author | Konar, Arpita Choudhury, Olivia Bullis, Rebecca Fiedler, Lauren Kruser, Jacqueline M. Stephens, Melissa T. Gailing, Oliver Schlarbaum, Scott Coggeshall, Mark V. Staton, Margaret E. Carlson, John E. Emrich, Scott Romero-Severson, Jeanne |
author_sort | Konar, Arpita |
collection | PubMed |
description | BACKGROUND: Restriction site associated DNA sequencing (RADseq) has the potential to be a broadly applicable, low-cost approach for high-quality genetic linkage mapping in forest trees lacking a reference genome. The statistical inference of linear order must be as accurate as possible for the correct ordering of sequence scaffolds and contigs to chromosomal locations. Accurate maps also facilitate the discovery of chromosome segments containing allelic variants conferring resistance to the biotic and abiotic stresses that threaten forest trees worldwide. We used ddRADseq for genetic mapping in the tree Quercus rubra, with an approach optimized to produce a high-quality map. Our study design also enabled us to model the results we would have obtained with less depth of coverage. RESULTS: Our sequencing design produced a high sequencing depth in the parents (248×) and a moderate sequencing depth (15×) in the progeny. The digital normalization method of generating a de novo reference and the SAMtools SNP variant caller yielded the most SNP calls (78,725). The major drivers of map inflation were multiple SNPs located within the same sequence (77% of SNPs called). The highest quality map was generated with a low level of missing data (5%) and a genome-wide threshold of 0.025 for deviation from Mendelian expectation. The final map included 849 SNP markers (1.8% of the 78,725 SNPs called). Downsampling the individual FASTQ files to model lower depth of coverage revealed that sequencing the progeny using 96 samples per lane would have yielded too few SNP markers to generate a map, even if we had sequenced the parents at depth 248×. CONCLUSIONS: The ddRADseq technology produced enough high-quality SNP markers to make a moderately dense, high-quality map. The success of this project was due to high depth of coverage of the parents, moderate depth of coverage of the progeny, a good framework map, an optimized bioinformatics pipeline, and rigorous premapping filters. The ddRADseq approach is useful for the construction of high-quality genetic maps in organisms lacking a reference genome if the parents and progeny are sequenced at sufficient depth. Technical improvements in reduced representation sequencing (RRS) approaches are needed to reduce the amount of missing data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3765-8) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5450186 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5450186 2017-06-01 High-quality genetic mapping with ddRADseq in the non-model tree Quercus rubra Konar, Arpita Choudhury, Olivia Bullis, Rebecca Fiedler, Lauren Kruser, Jacqueline M. Stephens, Melissa T. Gailing, Oliver Schlarbaum, Scott Coggeshall, Mark V. Staton, Margaret E. Carlson, John E. Emrich, Scott Romero-Severson, Jeanne BMC Genomics Research Article BioMed Central 2017-05-30 /pmc/articles/PMC5450186/ /pubmed/28558688 http://dx.doi.org/10.1186/s12864-017-3765-8 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | High-quality genetic mapping with ddRADseq in the non-model tree Quercus rubra |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5450186/ https://www.ncbi.nlm.nih.gov/pubmed/28558688 http://dx.doi.org/10.1186/s12864-017-3765-8 |
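
The abstract reports that the highest quality map came from markers with a low level of missing data (5%) and a threshold of 0.025 for deviation from Mendelian expectation. As a rough illustration only, the sketch below shows what such a premapping filter could look like. It is not the authors' pipeline. The function names, the 'A'/'B' genotype coding, the assumed 1:1 expected segregation ratio, and the use of 0.025 as a per-marker p-value cutoff (the abstract calls it a genome-wide threshold, and how it was applied is not stated in this record) are all assumptions made for the example.

```python
import math

MAX_MISSING = 0.05  # from the abstract: low level of missing data (5%)
ALPHA = 0.025       # from the abstract; applying it per marker is an assumption


def chi2_p_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def passes_filters(genotypes, max_missing=MAX_MISSING, alpha=ALPHA):
    """Return True if one marker passes both hypothetical premapping filters.

    genotypes: one entry per progeny, 'A' or 'B' for the two genotype
    classes, or None for a missing call. A 1:1 expected ratio is assumed.
    """
    n_total = len(genotypes)
    if n_total == 0:
        return False

    # Filter 1: fraction of progeny with a missing call.
    n_missing = sum(1 for g in genotypes if g is None)
    if n_missing / n_total > max_missing:
        return False

    # Filter 2: chi-square goodness-of-fit test against 1:1 segregation.
    n_a = genotypes.count("A")
    n_b = genotypes.count("B")
    expected = (n_a + n_b) / 2.0
    if expected == 0:
        return False
    stat = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    return chi2_p_1df(stat) >= alpha


if __name__ == "__main__":
    balanced = ["A"] * 50 + ["B"] * 48 + [None] * 2
    distorted = ["A"] * 70 + ["B"] * 30
    print(passes_filters(balanced), passes_filters(distorted))
```

A marker is kept only if it clears both checks, so a marker with well-balanced genotype classes is still dropped when too many progeny lack a call. Markers with other expected ratios (for example 1:2:1) would need a test with more classes and degrees of freedom.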