Whole genome sequencing of 35 individuals provides insights into the genetic architecture of Korean population

BACKGROUND: Due to a significant decline in the costs associated with next-generation sequencing, it has become possible to decipher the genetic architecture of a population by sequencing a large number of individuals to a deep coverage. The Korean Personal Genomes Project (KPGP) recently sequenced...

Bibliographic Details
Main authors: Zhang, Wenqian; Meehan, Joe; Su, Zhenqiang; Ng, Hui Wen; Shu, Mao; Luo, Heng; Ge, Weigong; Perkins, Roger; Tong, Weida; Hong, Huixiao
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4251052/
https://www.ncbi.nlm.nih.gov/pubmed/25350283
http://dx.doi.org/10.1186/1471-2105-15-S11-S6
author Zhang, Wenqian
Meehan, Joe
Su, Zhenqiang
Ng, Hui Wen
Shu, Mao
Luo, Heng
Ge, Weigong
Perkins, Roger
Tong, Weida
Hong, Huixiao
collection PubMed
description BACKGROUND: Owing to a significant decline in the cost of next-generation sequencing, it has become possible to decipher the genetic architecture of a population by sequencing a large number of individuals at deep coverage. The Korean Personal Genomes Project (KPGP) recently sequenced 35 Korean genomes at high coverage on the Illumina HiSeq platform and made the deep sequencing data publicly available, giving the scientific community an opportunity to decipher the genetic architecture of the Korean population.
METHODS: We used two single nucleotide variant (SNV) calling pipelines on the raw whole genome sequencing reads of the 35 Korean individuals in KPGP: mapping with BWA followed by SNV calling with SAMtools, and mapping with SOAP2 followed by SNV calling with SOAPsnp. The consensus SNVs obtained from the two pipelines were used to represent the SNVs of the Korean population. We compared these SNVs with those from 17 other populations provided by the HapMap consortium and the 1000 Genomes Project (1KGP) and identified SNVs present only in the Korean population. We studied the mutation spectrum and analyzed the genes carrying non-synonymous SNVs detected only in the Korean population.
RESULTS: We detected a total of 8,555,726 SNVs in the 35 Korean individuals. Of the SNVs not found in the 17 other populations, 1,213,613 were detected in at least one Korean individual (SNV-1) and 12,640 in all 35 Korean individuals (SNV-35). In contrast with the SNVs common to other populations in HapMap and 1KGP, the Korean-only SNVs had high percentages of non-silent variants, emphasizing the unique roles of these SNVs in the Korean population. Specifically, we identified 8,361 non-synonymous Korean-only SNVs, of which 58 were present in all 35 Korean individuals. The 5,754 genes carrying non-synonymous Korean-only SNVs were highly enriched in some metabolic pathways. Adhesion was the top disease term associated with SNV-1, and Nelson syndrome was the only disease term associated with SNV-35. A significant number of Korean-only SNVs are in genes associated with the drug term adenosine.
CONCLUSION: We identified the SNVs that were found in the Korean population but not in other populations, and explored the corresponding genes and pathways as well as the associated disease and drug terms. The results expand our knowledge of the genetic architecture of the Korean population, which will benefit the implementation of personalized medicine for the Korean population.
format Online
Article
Text
id pubmed-4251052
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4251052 2014-12-04 Whole genome sequencing of 35 individuals provides insights into the genetic architecture of Korean population. Zhang, Wenqian; Meehan, Joe; Su, Zhenqiang; Ng, Hui Wen; Shu, Mao; Luo, Heng; Ge, Weigong; Perkins, Roger; Tong, Weida; Hong, Huixiao. BMC Bioinformatics. Proceedings. BioMed Central 2014-10-21 /pmc/articles/PMC4251052/ /pubmed/25350283 http://dx.doi.org/10.1186/1471-2105-15-S11-S6 Text en Copyright © 2014 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Whole genome sequencing of 35 individuals provides insights into the genetic architecture of Korean population
topic Proceedings