Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples

Next-generation sequencing (NGS) of whole genomes has become more accessible to biomedical researchers as the sequencing price continues to drop, and more laboratories have NGS facilities or have access to a core facility. However, the rapid and robust development of practical bioinformatics pipelin...

Bibliographic Details
Main authors: Hansen, Marcus Høy, Nyvold, Charlotte Guldborg
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8427263/
https://www.ncbi.nlm.nih.gov/pubmed/34522736
http://dx.doi.org/10.1016/j.dib.2021.107349
_version_ 1783750158282915840
author Hansen, Marcus Høy
Nyvold, Charlotte Guldborg
collection PubMed
description Next-generation sequencing (NGS) of whole genomes has become more accessible to biomedical researchers as the sequencing price continues to drop and more laboratories have NGS facilities or access to a core facility. However, the rapid and robust development of practical bioinformatics pipelines partly depends on convenient access to data for the testing of algorithms. Publicly available data sets constitute a part of this strategy. Here, we provide a triplicate whole-genome paired-end sequencing data set, consisting of 1.38 billion raw sequencing reads derived from saliva DNA from a single anonymous male Caucasian donor, with the average sequencing depths aimed at 30x for two of the samples and 4x for a low-coverage sample. The raw number of single nucleotide variants was 3.3–4 million, and the median variant read depths of GATK4-passed variants in the three samples were 22, 18, and 10. Of all variants, 81% were found in two or three of the samples, whereas 19% were singletons. The karyotype was evaluated as 46,XY with no apparent copy-number variation. The data set is provided without restrictions for research, educational or commercial purposes.
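The shared-versus-singleton split quoted in the description can be illustrated with a small Python sketch. This is not the authors' pipeline: the function name `sharing_summary`, the variant key layout (chromosome, position, reference allele, alternate allele), and the toy call sets are all assumptions made for illustration, and real call sets would first have to be parsed from the VCF files.

```python
from collections import Counter


def sharing_summary(call_sets):
    """Count how many variants occur in one versus several replicates.

    call_sets is a list of sets, one per replicate, whose elements are
    hypothetical variant keys of the form (chrom, pos, ref, alt).
    """
    occurrences = Counter()
    for calls in call_sets:
        # Each replicate contributes at most one count per variant.
        occurrences.update(set(calls))

    total = len(occurrences)
    singletons = sum(1 for n in occurrences.values() if n == 1)
    shared = total - singletons
    return {
        "total": total,
        "shared": shared,
        "singletons": singletons,
        "shared_fraction": shared / total if total else 0.0,
    }


# Toy data only; these variants are invented for the example.
sample_a = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")}
sample_b = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 10, "T", "C")}
sample_c = {("chr1", 100, "A", "G"), ("chr2", 75, "A", "C")}

summary = sharing_summary([sample_a, sample_b, sample_c])
print(summary)
```

Keying on the full (chrom, pos, ref, alt) tuple is one possible design choice; it treats different alternate alleles at the same position as distinct variants.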
format Online
Article
Text
id pubmed-8427263
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8427263 2021-09-13 Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples Hansen, Marcus Høy Nyvold, Charlotte Guldborg Data Brief Data Article Next-generation sequencing (NGS) of whole genomes has become more accessible to biomedical researchers as the sequencing price continues to drop, and more laboratories have NGS facilities or have access to a core facility. However, the rapid and robust development of practical bioinformatics pipelines partly depends on convenient access to data for the testing of algorithms. Publicly available data sets constitute a part of this strategy. Here, we provide a triplicate whole-genome paired-end sequencing data set, consisting of 1.38 billion raw sequencing reads derived from saliva DNA from a single anonymous male Caucasian donor, with the average sequencing depths aimed at 30x for two of the samples and 4x for a low-coverage sample. The raw number of single nucleotide variants were 3.3–4 million and the median variant read depth of GATK4-passed variants in three samples was 22, 18, and 10. 81% of all variants were found in two or three of the samples, whereas 19% were singletons. The karyotype was evaluated as 46,XY with no apparent copy-number variation. The data set is provided without restrictions for research, educational or commercial purposes. Elsevier 2021-09-04 /pmc/articles/PMC8427263/ /pubmed/34522736 http://dx.doi.org/10.1016/j.dib.2021.107349 Text en © 2021 The Author(s). Published by Elsevier Inc. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Data Article
Hansen, Marcus Høy
Nyvold, Charlotte Guldborg
Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples
title Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8427263/
https://www.ncbi.nlm.nih.gov/pubmed/34522736
http://dx.doi.org/10.1016/j.dib.2021.107349