Whole genome sequencing of 35 individuals provides insights into the genetic architecture of Korean population
BACKGROUND: Due to a significant decline in the costs associated with next-generation sequencing, it has become possible to decipher the genetic architecture of a population by sequencing a large number of individuals to a deep coverage. The Korean Personal Genomes Project (KPGP) recently sequenced...
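Main authors: | Zhang, Wenqian; Meehan, Joe; Su, Zhenqiang; Ng, Hui Wen; Shu, Mao; Luo, Heng; Ge, Weigong; Perkins, Roger; Tong, Weida; Hong, Huixiao |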
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4251052/ https://www.ncbi.nlm.nih.gov/pubmed/25350283 http://dx.doi.org/10.1186/1471-2105-15-S11-S6 |
_version_ | 1782346995235553280 |
---|---|
author | Zhang, Wenqian; Meehan, Joe; Su, Zhenqiang; Ng, Hui Wen; Shu, Mao; Luo, Heng; Ge, Weigong; Perkins, Roger; Tong, Weida; Hong, Huixiao |
author_facet | Zhang, Wenqian; Meehan, Joe; Su, Zhenqiang; Ng, Hui Wen; Shu, Mao; Luo, Heng; Ge, Weigong; Perkins, Roger; Tong, Weida; Hong, Huixiao |
author_sort | Zhang, Wenqian |
collection | PubMed |
description | BACKGROUND: Due to a significant decline in the costs associated with next-generation sequencing, it has become possible to decipher the genetic architecture of a population by sequencing a large number of individuals to deep coverage. The Korean Personal Genomes Project (KPGP) recently sequenced 35 Korean genomes at high coverage using the Illumina HiSeq platform and made the deep sequencing data publicly available, giving the scientific community the opportunity to decipher the genetic architecture of the Korean population. METHODS: In this study, we used two single nucleotide variant (SNV) calling pipelines: the raw reads obtained from whole genome sequencing of 35 Korean individuals in KPGP were mapped using BWA and SOAP2, followed by SNV calling using SAMtools and SOAPsnp, respectively. The consensus SNVs obtained from the two pipelines were used to represent the SNVs of the Korean population. We compared these SNVs to those from 17 other populations provided by the HapMap consortium and the 1000 Genomes Project (1KGP) and identified SNVs that were present only in the Korean population. We studied the mutation spectrum and analyzed the genes with non-synonymous SNVs detected only in the Korean population. RESULTS: We detected a total of 8,555,726 SNVs in the 35 Korean individuals and identified 1,213,613 SNVs detected in at least one Korean individual (SNV-1) and 12,640 detected in all 35 Korean individuals (SNV-35) but not in the 17 other populations. In contrast with the SNVs common to other populations in HapMap and 1KGP, the Korean-only SNVs had high percentages of non-silent variants, emphasizing the unique roles of these SNVs in the Korean population. Specifically, we identified 8,361 non-synonymous Korean-only SNVs, of which 58 existed in all 35 Korean individuals. The 5,754 genes with non-synonymous Korean-only SNVs were highly enriched in some metabolic pathways. We found that adhesion is the top disease term associated with SNV-1 and that Nelson syndrome is the only disease term associated with SNV-35. We also found that a significant number of Korean-only SNVs are in genes associated with the drug term adenosine. CONCLUSION: We identified the SNVs that were found in the Korean population but not seen in other populations, and explored the corresponding genes and pathways as well as the associated disease terms and drug terms. The results expand our knowledge of the genetic architecture of the Korean population, which will benefit the implementation of personalized medicine for the Korean population. |
format | Online Article Text |
id | pubmed-4251052 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4251052 2014-12-04 Whole genome sequencing of 35 individuals provides insights into the genetic architecture of Korean population Zhang, Wenqian; Meehan, Joe; Su, Zhenqiang; Ng, Hui Wen; Shu, Mao; Luo, Heng; Ge, Weigong; Perkins, Roger; Tong, Weida; Hong, Huixiao BMC Bioinformatics Proceedings BACKGROUND: Due to a significant decline in the costs associated with next-generation sequencing, it has become possible to decipher the genetic architecture of a population by sequencing a large number of individuals to deep coverage. The Korean Personal Genomes Project (KPGP) recently sequenced 35 Korean genomes at high coverage using the Illumina HiSeq platform and made the deep sequencing data publicly available, giving the scientific community the opportunity to decipher the genetic architecture of the Korean population. METHODS: In this study, we used two single nucleotide variant (SNV) calling pipelines: the raw reads obtained from whole genome sequencing of 35 Korean individuals in KPGP were mapped using BWA and SOAP2, followed by SNV calling using SAMtools and SOAPsnp, respectively. The consensus SNVs obtained from the two pipelines were used to represent the SNVs of the Korean population. We compared these SNVs to those from 17 other populations provided by the HapMap consortium and the 1000 Genomes Project (1KGP) and identified SNVs that were present only in the Korean population. We studied the mutation spectrum and analyzed the genes with non-synonymous SNVs detected only in the Korean population. RESULTS: We detected a total of 8,555,726 SNVs in the 35 Korean individuals and identified 1,213,613 SNVs detected in at least one Korean individual (SNV-1) and 12,640 detected in all 35 Korean individuals (SNV-35) but not in the 17 other populations. In contrast with the SNVs common to other populations in HapMap and 1KGP, the Korean-only SNVs had high percentages of non-silent variants, emphasizing the unique roles of these SNVs in the Korean population. Specifically, we identified 8,361 non-synonymous Korean-only SNVs, of which 58 existed in all 35 Korean individuals. The 5,754 genes with non-synonymous Korean-only SNVs were highly enriched in some metabolic pathways. We found that adhesion is the top disease term associated with SNV-1 and that Nelson syndrome is the only disease term associated with SNV-35. We also found that a significant number of Korean-only SNVs are in genes associated with the drug term adenosine. CONCLUSION: We identified the SNVs that were found in the Korean population but not seen in other populations, and explored the corresponding genes and pathways as well as the associated disease terms and drug terms. The results expand our knowledge of the genetic architecture of the Korean population, which will benefit the implementation of personalized medicine for the Korean population. BioMed Central 2014-10-21 /pmc/articles/PMC4251052/ /pubmed/25350283 http://dx.doi.org/10.1186/1471-2105-15-S11-S6 Text en Copyright © 2014 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Proceedings Zhang, Wenqian Meehan, Joe Su, Zhenqiang Ng, Hui Wen Shu, Mao Luo, Heng Ge, Weigong Perkins, Roger Tong, Weida Hong, Huixiao Whole genome sequencing of 35 individuals provides insights into the genetic architecture of Korean population |
title | Whole genome sequencing of 35 individuals provides insights into the genetic architecture of Korean population |
title_sort | whole genome sequencing of 35 individuals provides insights into the genetic architecture of korean population |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4251052/ https://www.ncbi.nlm.nih.gov/pubmed/25350283 http://dx.doi.org/10.1186/1471-2105-15-S11-S6 |