
Optimal sequencing depth design for whole genome re-sequencing in pigs

BACKGROUND: As whole-genome sequencing is becoming a routine technique, it is important to identify a cost-effective depth of sequencing for such studies. However, the relationship between sequencing depth and biological results from the aspects of whole-genome coverage, variant discovery power and...


Bibliographic Details
Main authors: Jiang, Yifan; Jiang, Yao; Wang, Sheng; Zhang, Qin; Ding, Xiangdong
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6839175/
https://www.ncbi.nlm.nih.gov/pubmed/31703550
http://dx.doi.org/10.1186/s12859-019-3164-z
_version_ 1783467359695011840
author Jiang, Yifan
Jiang, Yao
Wang, Sheng
Zhang, Qin
Ding, Xiangdong
author_facet Jiang, Yifan
Jiang, Yao
Wang, Sheng
Zhang, Qin
Ding, Xiangdong
author_sort Jiang, Yifan
collection PubMed
description BACKGROUND: As whole-genome sequencing becomes a routine technique, it is important to identify a cost-effective sequencing depth for such studies. However, the relationship between sequencing depth and biological results, in terms of whole-genome coverage, variant discovery power and variant quality, is unclear, especially in pigs. We sequenced the genomes of three Yorkshire boars at approximately 20X depth on the Illumina HiSeq X Ten platform and downloaded whole-genome sequencing data for three Duroc and three Landrace pigs, each at approximately 20X depth. We then downsampled the deep genome data by extracting eleven proportions (0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9) of the paired reads from the original BAM files. Together with the full data, this mimics sequence data for the same individuals at twelve sequencing depths of 1.09X, 2.18X, 3.26X, 4.35X, 6.53X, 8.70X, 10.88X, 13.05X, 15.22X, 17.40X, 19.57X and 21.75X, which we used to evaluate genome coverage, the variant discovery rate and genotyping accuracy as functions of sequencing depth. In addition, SNP chip data for the Yorkshire pigs were used to validate the comparison of single-sample and multisample calling algorithms. RESULTS: Our results indicated that 10X is an ideal practical depth for achieving plateau coverage and discovering accurate variants, with greater than 99% genome coverage. The number of false-positive variants increased dramatically at depths below 4X, at which 95% of the whole genome was covered. In addition, the comparison of multi- and single-sample calling showed that multisample calling was more sensitive than single-sample calling, especially at lower depths. The number of variants discovered under multisample calling was 13-fold and 2-fold higher than under single-sample calling at 1X and 22X, respectively. A large difference was observed when the depth was less than 4.38X. However, more false-positive variants were detected under multisample calling. CONCLUSIONS: Our research will inform important study design decisions regarding whole-genome sequencing depth. Our results will help in choosing an appropriate depth to achieve the same power in studies performed under limited budgets.
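The link between the listed subsampling proportions and the mimicked depths is simple arithmetic, and a short sketch may make it easier to check. The Python below assumes that 21.75X is the mean depth of the full data, so each expected depth is the proportion times 21.75. The abstract does not name the subsampling tool; building a `samtools view -s` command line is an assumption made for illustration, and the file names and seed are hypothetical.

```python
# Values taken from the abstract: full-data depth, listed proportions, reported depths.
FULL_DEPTH = 21.75  # assumed to be the mean depth of the unsampled data
FRACTIONS = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
REPORTED_DEPTHS = [1.09, 2.18, 3.26, 4.35, 6.53, 8.70,
                   10.88, 13.05, 15.22, 17.40, 19.57]


def expected_depth(fraction, full_depth=FULL_DEPTH):
    """Mean depth expected after keeping `fraction` of the read pairs."""
    return fraction * full_depth


def subsample_command(in_bam, out_bam, fraction, seed=42):
    """Build a samtools command line for one proportion.

    Assumption: samtools is the tool; the abstract does not say which was used.
    `samtools view -s` takes SEED.FRACTION: the integer part seeds the random
    number generator and the fractional part is the proportion of pairs kept.
    """
    return ["samtools", "view", "-b", "-s", f"{seed + fraction:.2f}",
            "-o", out_bam, in_bam]


if __name__ == "__main__":
    for frac, reported in zip(FRACTIONS, REPORTED_DEPTHS):
        print(f"{frac:.2f} -> {expected_depth(frac):.2f}X "
              f"(reported {reported:.2f}X)")
    # Hypothetical file names, shown only to illustrate the command shape.
    print(" ".join(subsample_command("sample.bam", "sample.0.05.bam", 0.05)))
```

The computed and reported depths agree to within 0.01X; the small differences are consistent with the full depth having been rounded to 21.75X in the abstract.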
format Online
Article
Text
id pubmed-6839175
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6839175 2019-11-12 Optimal sequencing depth design for whole genome re-sequencing in pigs Jiang, Yifan Jiang, Yao Wang, Sheng Zhang, Qin Ding, Xiangdong BMC Bioinformatics Research Article BioMed Central 2019-11-08 /pmc/articles/PMC6839175/ /pubmed/31703550 http://dx.doi.org/10.1186/s12859-019-3164-z Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Jiang, Yifan
Jiang, Yao
Wang, Sheng
Zhang, Qin
Ding, Xiangdong
Optimal sequencing depth design for whole genome re-sequencing in pigs
title Optimal sequencing depth design for whole genome re-sequencing in pigs
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6839175/
https://www.ncbi.nlm.nih.gov/pubmed/31703550
http://dx.doi.org/10.1186/s12859-019-3164-z