
Establishment of reference standards for multifaceted mosaic variant analysis

Detection of somatic mosaicism in non-proliferative cells is a new challenge in genome research; however, the accuracy of current detection strategies remains uncertain due to the lack of a ground truth. Herein, we sought to present a set of ultra-deep sequenced WES data based on reference standards...


Bibliographic Details
Main authors: Ha, Yoo-Jin, Oh, Myung Joon, Kim, Junhan, Kim, Jisoo, Kang, Seungseok, Minna, John D., Kim, Hyun Seok, Kim, Sangwoo
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8813952/
https://www.ncbi.nlm.nih.gov/pubmed/35115554
http://dx.doi.org/10.1038/s41597-022-01133-8
author Ha, Yoo-Jin
Oh, Myung Joon
Kim, Junhan
Kim, Jisoo
Kang, Seungseok
Minna, John D.
Kim, Hyun Seok
Kim, Sangwoo
author_sort Ha, Yoo-Jin
collection PubMed
description Detection of somatic mosaicism in non-proliferative cells is a new challenge in genome research; however, the accuracy of current detection strategies remains uncertain due to the lack of a ground truth. Herein, we sought to present a set of ultra-deep sequenced WES data based on reference standards generated by cell line mixtures, providing a total of 386,613 mosaic single-nucleotide variants (SNVs) and insertion-deletion mutations (INDELs) with variant allele frequencies (VAFs) ranging from 0.5% to 56%, as well as 35,113,417 non-variant and 19,936 germline variant sites as a negative control. The whole reference standard set mimics the cumulative aspect of mosaic variant acquisition, such as in the early developmental stage, owing to the progressive mixing of cell lines with established genotypes, ultimately unveiling 741 possible inter-sample relationships with respect to variant sharing and asymmetry in VAFs. We expect that our reference data will be essential for optimizing the current use of mosaic variant detection strategies and for developing algorithms to enable future improvements.
format Online
Article
Text
id pubmed-8813952
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8813952 2022-02-10 Sci Data Data Descriptor Nature Publishing Group UK 2022-02-03 /pmc/articles/PMC8813952/ /pubmed/35115554 http://dx.doi.org/10.1038/s41597-022-01133-8 Text en © The Author(s) 2022
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the metadata files associated with this article.
title Establishment of reference standards for multifaceted mosaic variant analysis
topic Data Descriptor