Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis

BACKGROUND: Understanding the underlying genetic structure of human populations is of fundamental interest to both biological and social sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation. The most widely...

Bibliographic Details
Main Authors: Maróti, Zoltán; Boldogkői, Zsolt; Tombácz, Dóra; Snyder, Michael; Kalmár, Tibor
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6206721/
https://www.ncbi.nlm.nih.gov/pubmed/30373510
http://dx.doi.org/10.1186/s12864-018-5168-x
_version_ 1783366407505838080
author Maróti, Zoltán
Boldogkői, Zsolt
Tombácz, Dóra
Snyder, Michael
Kalmár, Tibor
author_facet Maróti, Zoltán
Boldogkői, Zsolt
Tombácz, Dóra
Snyder, Michael
Kalmár, Tibor
author_sort Maróti, Zoltán
collection PubMed
description BACKGROUND: Understanding the underlying genetic structure of human populations is of fundamental interest to both biological and social sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation. The most widely used methods for collecting variant information at the DNA-level include whole genome sequencing, which remains costly, and the more economical solution of array-based techniques, as these are capable of simultaneously genotyping a pre-selected set of variable DNA sites in the human genome. The largest publicly accessible set of human genomic sequence data available today originates from exome sequencing that comprises around 1.2% of the whole genome (approximately 30 million base pairs). RESULTS: To unbiasedly compare the effect of SNP selection strategies in population genetic analysis we subsampled the variants of the same highly curated 1 K Genome dataset to mimic genome, exome sequencing and array data in order to eliminate the effect of different chemistry and error profiles of these different approaches. Next we compared the application of the exome dataset to the array-based dataset and to the gold standard whole genome dataset using the same population genetic analysis methods. CONCLUSIONS: Our results draw attention to some of the inherent problems that arise from using pre-selected SNP sets for population genetic analysis. Additionally, we demonstrate that exome sequencing provides a better alternative to the array-based methods for population genetic analysis. In this study, we propose a strategy for unbiased variant collection from exome data and offer a bioinformatics protocol for proper data processing. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-5168-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6206721
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-62067212018-10-31 Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis Maróti, Zoltán Boldogkői, Zsolt Tombácz, Dóra Snyder, Michael Kalmár, Tibor BMC Genomics Methodology Article BACKGROUND: Understanding the underlying genetic structure of human populations is of fundamental interest to both biological and social sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation. The most widely used methods for collecting variant information at the DNA-level include whole genome sequencing, which remains costly, and the more economical solution of array-based techniques, as these are capable of simultaneously genotyping a pre-selected set of variable DNA sites in the human genome. The largest publicly accessible set of human genomic sequence data available today originates from exome sequencing that comprises around 1.2% of the whole genome (approximately 30 million base pairs). RESULTS: To unbiasedly compare the effect of SNP selection strategies in population genetic analysis we subsampled the variants of the same highly curated 1 K Genome dataset to mimic genome, exome sequencing and array data in order to eliminate the effect of different chemistry and error profiles of these different approaches. Next we compared the application of the exome dataset to the array-based dataset and to the gold standard whole genome dataset using the same population genetic analysis methods. CONCLUSIONS: Our results draw attention to some of the inherent problems that arise from using pre-selected SNP sets for population genetic analysis. Additionally, we demonstrate that exome sequencing provides a better alternative to the array-based methods for population genetic analysis. 
In this study, we propose a strategy for unbiased variant collection from exome data and offer a bioinformatics protocol for proper data processing. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-5168-x) contains supplementary material, which is available to authorized users. BioMed Central 2018-10-29 /pmc/articles/PMC6206721/ /pubmed/30373510 http://dx.doi.org/10.1186/s12864-018-5168-x Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Maróti, Zoltán
Boldogkői, Zsolt
Tombácz, Dóra
Snyder, Michael
Kalmár, Tibor
Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis
title Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis
title_full Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis
title_fullStr Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis
title_full_unstemmed Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis
title_short Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis
title_sort evaluation of whole exome sequencing as an alternative to beadchip and whole genome sequencing in human population genetic analysis
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6206721/
https://www.ncbi.nlm.nih.gov/pubmed/30373510
http://dx.doi.org/10.1186/s12864-018-5168-x
work_keys_str_mv AT marotizoltan evaluationofwholeexomesequencingasanalternativetobeadchipandwholegenomesequencinginhumanpopulationgeneticanalysis
AT boldogkoizsolt evaluationofwholeexomesequencingasanalternativetobeadchipandwholegenomesequencinginhumanpopulationgeneticanalysis
AT tombaczdora evaluationofwholeexomesequencingasanalternativetobeadchipandwholegenomesequencinginhumanpopulationgeneticanalysis
AT snydermichael evaluationofwholeexomesequencingasanalternativetobeadchipandwholegenomesequencinginhumanpopulationgeneticanalysis
AT kalmartibor evaluationofwholeexomesequencingasanalternativetobeadchipandwholegenomesequencinginhumanpopulationgeneticanalysis