Revising a Personal Genome by Comparing and Combining Data from Two Different Sequencing Platforms

For the robust practice of genomic medicine, sequencing results must be compatible, regardless of the sequencing technologies and algorithms used. Presently, genome sequencing is still an imprecise science and is complicated by differences in the chemistry, coverage, alignment, and variant-calling a...

Bibliographic Details
Main Authors: Kim, Deokhoon, Kim, Woo-Yeon, Lee, Sun-Young, Lee, Sung-Yeoun, Yun, Hongseok, Shin, Soo-Yong, Lee, Jungyoun, Hong, Yoojin, Won, Youngmi, Kim, Seong-Jin, Lee, Yong Seok, Ahn, Sung-Min
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3620462/
https://www.ncbi.nlm.nih.gov/pubmed/23593254
http://dx.doi.org/10.1371/journal.pone.0060585
author Kim, Deokhoon
Kim, Woo-Yeon
Lee, Sun-Young
Lee, Sung-Yeoun
Yun, Hongseok
Shin, Soo-Yong
Lee, Jungyoun
Hong, Yoojin
Won, Youngmi
Kim, Seong-Jin
Lee, Yong Seok
Ahn, Sung-Min
author_sort Kim, Deokhoon
collection PubMed
description For the robust practice of genomic medicine, sequencing results must be compatible, regardless of the sequencing technologies and algorithms used. Presently, genome sequencing is still an imprecise science and is complicated by differences in the chemistry, coverage, alignment, and variant-calling algorithms. We identified ∼3.33 million single nucleotide variants (SNVs) and ∼3.62 million SNVs in the SJK genome using SOLiD and Illumina data, respectively. Approximately 3 million SNVs were concordant between the two platforms while 68,532 SNVs were discordant; 219,616 SNVs were SOLiD-specific and 516,080 SNVs were Illumina-specific (i.e., platform-specific). Concordant, discordant, and platform-specific SNVs were further analyzed and characterized. Overall, a large portion of heterozygous SNVs that were discordant with genotyping calls of single nucleotide polymorphism chips were highly confident. Approximately 70% of the platform-specific SNVs were located in regions containing repetitive sequences. Such platform-specificity may arise from differences between platforms, with regard to read length (36 bp and 72 bp vs. 50 bp), insert size (∼100–300 bp vs. ∼1–2 kb), sequencing chemistry (sequencing-by-synthesis using single nucleotides vs. ligation-based sequencing using oligomers), and sequencing quality. When data from the two platforms were merged for variant calling, the proportion of callable regions of the reference genome increased to 99.66%, which was 1.43% higher than the average callability of the two platforms, representing ∼40 million bases. In this study, we compared the differences in sequencing results between two sequencing platforms. Approximately 90% of the SNVs were concordant between the two platforms, yet ∼10% of the SNVs were either discordant or platform-specific, indicating that each platform had its own strengths and weaknesses. 
When data from the two platforms were merged, both the overall callability of the reference genome and the overall accuracy of the SNVs improved, demonstrating the likelihood that a re-sequenced genome can be revised using complementary data.
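The figures quoted in the description can be cross-checked with a little arithmetic. The following is a minimal sketch, not the authors' analysis: it assumes each platform's SNV total splits into concordant, discordant, and platform-specific calls, it reads "1.43% higher" as percentage points, and it uses a hypothetical reference size of about 2.86 billion bases (`ASSUMED_REFERENCE_BASES`), a denominator the description does not state.

```python
# Counts quoted in the description (SNVs called in the SJK genome).
DISCORDANT = 68_532
SOLID_SPECIFIC = 219_616
ILLUMINA_SPECIFIC = 516_080
SOLID_TOTAL = 3_330_000      # "~3.33 million": rounded in the source
ILLUMINA_TOTAL = 3_620_000   # "~3.62 million": rounded in the source

# Callability figures quoted in the description.
MERGED_CALLABLE_PCT = 99.66
GAIN_PCT_POINTS = 1.43       # assumption: read as percentage points

# Assumption: hypothetical reference size, not given in the description.
ASSUMED_REFERENCE_BASES = 2_860_000_000


def concordant_estimate(total: int, discordant: int, specific: int) -> int:
    """SNVs left after removing discordant and platform-specific calls."""
    return total - discordant - specific


def non_concordant_share(total: int, discordant: int, specific: int) -> float:
    """Fraction of one platform's SNVs that are discordant or platform-specific."""
    return (discordant + specific) / total


def average_callable_pct(merged_pct: float, gain_points: float) -> float:
    """Average single-platform callability implied by the merged figure."""
    return merged_pct - gain_points


def gained_bases(gain_points: float, reference_bases: int) -> float:
    """Number of reference bases corresponding to the callability gain."""
    return gain_points / 100 * reference_bases


if __name__ == "__main__":
    for name, total, specific in (
        ("SOLiD", SOLID_TOTAL, SOLID_SPECIFIC),
        ("Illumina", ILLUMINA_TOTAL, ILLUMINA_SPECIFIC),
    ):
        concordant = concordant_estimate(total, DISCORDANT, specific)
        share = non_concordant_share(total, DISCORDANT, specific)
        print(f"{name}: ~{concordant:,} concordant, {share:.1%} non-concordant")

    avg = average_callable_pct(MERGED_CALLABLE_PCT, GAIN_PCT_POINTS)
    bases = gained_bases(GAIN_PCT_POINTS, ASSUMED_REFERENCE_BASES)
    print(f"Implied average callability: {avg:.2f}%")
    print(f"Bases gained under the assumed reference size: ~{bases / 1e6:.1f} million")
```

Under these assumptions, both platform totals imply roughly 3 million concordant SNVs, and the callability gain lands near the ∼40 million bases quoted. Because the platform totals are rounded in the source, the per-platform shares are only approximate.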
format Online
Article
Text
id pubmed-3620462
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3620462 2013-04-16 PLoS One Research Article Public Library of Science 2013-04-08 /pmc/articles/PMC3620462/ /pubmed/23593254 http://dx.doi.org/10.1371/journal.pone.0060585 Text en © 2013 Kim et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Revising a Personal Genome by Comparing and Combining Data from Two Different Sequencing Platforms
topic Research Article