
Development of the variant calling algorithm, ADIScan, and its use to estimate discordant sequences between monozygotic twins

Calling variants from next-generation sequencing (NGS) data or discovering discordant sequences between two NGS data sets is challenging. We developed a computer algorithm, ADIScan1, to call variants by comparing the fractions of allelic reads in a tester to the universal reference genome. We then c...


Bibliographic Details
Main authors: Cho, Yangrae; Lee, Sunho; Hong, Jong Hui; Kim, Byong Joon; Hong, Woon-Young; Jung, Jongcheol; Lee, Hyang Burm; Sung, Joohon; Kim, Han-Na; Kim, Hyung-Lae; Jung, Jongsun
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6125643/
https://www.ncbi.nlm.nih.gov/pubmed/29873758
http://dx.doi.org/10.1093/nar/gky445
_version_ 1783353197813825536
author Cho, Yangrae
Lee, Sunho
Hong, Jong Hui
Kim, Byong Joon
Hong, Woon-Young
Jung, Jongcheol
Lee, Hyang Burm
Sung, Joohon
Kim, Han-Na
Kim, Hyung-Lae
Jung, Jongsun
author_sort Cho, Yangrae
collection PubMed
description Calling variants from next-generation sequencing (NGS) data or discovering discordant sequences between two NGS data sets is challenging. We developed a computer algorithm, ADIScan1, to call variants by comparing the fractions of allelic reads in a tester to the universal reference genome. We then created ADIScan2 by modifying the algorithm to directly compare two sets of NGS data and predict discordant sequences between two testers. ADIScan1 detected >99.7% of variants called by GATK with an additional 724 393 SNVs. ADIScan2 identified ∼500 candidates of discordant sequences in each of two pairs of the monozygotic twins. About 200 of these candidates were included in the ∼2800 predicted by VarScan2. We verified 66 true discordant sequences among the candidates that ADIScan2 and VarScan2 exclusively predicted. ADIScan2 detected many discordant sequences overlooked by VarScan2 and Mutect, which specialize in detecting low frequency mutations in genetically heterogeneous cancerous tissues. Numbers of verified sequences alone were >5 times more than expected based on recently estimated mutation rates from whole genome sequences. Estimated post-zygotic mutation rates were 1.68 × 10⁻⁷ in this study. ADIScan1 and 2 would complement existing tools in screening causative mutations of diverse genetic diseases and comparing two sets of genome sequences, respectively.
format Online
Article
Text
id pubmed-6125643
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6125643 2018-09-11 Nucleic Acids Res Methods Online Oxford University Press 2018-09-06 2018-06-05 /pmc/articles/PMC6125643/ /pubmed/29873758 http://dx.doi.org/10.1093/nar/gky445 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Development of the variant calling algorithm, ADIScan, and its use to estimate discordant sequences between monozygotic twins
topic Methods Online