Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples
Next-generation sequencing (NGS) of whole genomes has become more accessible to biomedical researchers as the sequencing price continues to drop, and more laboratories have NGS facilities or have access to a core facility. However, the rapid and robust development of practical bioinformatics pipelin...
Main authors: | Hansen, Marcus Høy; Nyvold, Charlotte Guldborg |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2021 |
Subjects: | Data Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8427263/ https://www.ncbi.nlm.nih.gov/pubmed/34522736 http://dx.doi.org/10.1016/j.dib.2021.107349 |
_version_ | 1783750158282915840 |
---|---|
author | Hansen, Marcus Høy Nyvold, Charlotte Guldborg |
author_facet | Hansen, Marcus Høy Nyvold, Charlotte Guldborg |
author_sort | Hansen, Marcus Høy |
collection | PubMed |
description | Next-generation sequencing (NGS) of whole genomes has become more accessible to biomedical researchers as the sequencing price continues to drop, and more laboratories have NGS facilities or have access to a core facility. However, the rapid and robust development of practical bioinformatics pipelines partly depends on convenient access to data for the testing of algorithms. Publicly available data sets constitute a part of this strategy. Here, we provide a triplicate whole-genome paired-end sequencing data set, consisting of 1.38 billion raw sequencing reads derived from saliva DNA from a single anonymous male Caucasian donor, with the average sequencing depths aimed at 30x for two of the samples and 4x for a low-coverage sample. The raw number of single nucleotide variants was 3.3–4 million, and the median variant read depth of GATK4-passed variants in the three samples was 22, 18, and 10, respectively. Of all variants, 81% were found in two or three of the samples, whereas 19% were singletons. The karyotype was evaluated as 46,XY with no apparent copy-number variation. The data set is provided without restrictions for research, educational or commercial purposes. |
format | Online Article Text |
id | pubmed-8427263 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8427263 2021-09-13 Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples Hansen, Marcus Høy Nyvold, Charlotte Guldborg Data Brief Data Article Next-generation sequencing (NGS) of whole genomes has become more accessible to biomedical researchers as the sequencing price continues to drop, and more laboratories have NGS facilities or have access to a core facility. However, the rapid and robust development of practical bioinformatics pipelines partly depends on convenient access to data for the testing of algorithms. Publicly available data sets constitute a part of this strategy. Here, we provide a triplicate whole-genome paired-end sequencing data set, consisting of 1.38 billion raw sequencing reads derived from saliva DNA from a single anonymous male Caucasian donor, with the average sequencing depths aimed at 30x for two of the samples and 4x for a low-coverage sample. The raw number of single nucleotide variants was 3.3–4 million, and the median variant read depth of GATK4-passed variants in the three samples was 22, 18, and 10, respectively. Of all variants, 81% were found in two or three of the samples, whereas 19% were singletons. The karyotype was evaluated as 46,XY with no apparent copy-number variation. The data set is provided without restrictions for research, educational or commercial purposes. Elsevier 2021-09-04 /pmc/articles/PMC8427263/ /pubmed/34522736 http://dx.doi.org/10.1016/j.dib.2021.107349 Text en © 2021 The Author(s). Published by Elsevier Inc. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Data Article Hansen, Marcus Høy Nyvold, Charlotte Guldborg Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples |
title | Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples |
topic | Data Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8427263/ https://www.ncbi.nlm.nih.gov/pubmed/34522736 http://dx.doi.org/10.1016/j.dib.2021.107349 |