Impact of DNA source on genetic variant detection from human whole-genome sequencing data

Bibliographic Details
Main Authors: Trost, Brett; Walker, Susan; Haider, Syed A; Sung, Wilson W L; Pereira, Sergio; Phillips, Charly L; Higginbotham, Edward J; Strug, Lisa J; Nguyen, Charlotte; Raajkumar, Akshaya; Szego, Michael J; Marshall, Christian R; Scherer, Stephen W
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929712/
https://www.ncbi.nlm.nih.gov/pubmed/31515274
http://dx.doi.org/10.1136/jmedgenet-2019-106281
_version_ 1783482756231069696
author Trost, Brett
Walker, Susan
Haider, Syed A
Sung, Wilson W L
Pereira, Sergio
Phillips, Charly L
Higginbotham, Edward J
Strug, Lisa J
Nguyen, Charlotte
Raajkumar, Akshaya
Szego, Michael J
Marshall, Christian R
Scherer, Stephen W
author_sort Trost, Brett
collection PubMed
description BACKGROUND: Whole blood is currently the most common DNA source for whole-genome sequencing (WGS), but for studies requiring non-invasive collection, self-collection, greater sample stability or additional tissue references, saliva or buccal samples may be preferred. However, the relative quality of sequencing data and accuracy of genetic variant detection from blood-derived, saliva-derived and buccal-derived DNA need to be thoroughly investigated. METHODS: Matched blood, saliva and buccal samples from four unrelated individuals were used to compare sequencing metrics and variant-detection accuracy among these DNA sources. RESULTS: We observed significant differences among DNA sources for sequencing quality metrics such as percentage of reads aligned and mean read depth (p<0.05). Differences were negligible in the accuracy of detecting short insertions and deletions; however, the false positive rate for single nucleotide variation detection was slightly higher in some saliva and buccal samples. The sensitivity of copy number variant (CNV) detection was up to 25% higher in blood samples, depending on CNV size and type, and appeared to be worse in saliva and buccal samples with high bacterial concentration. We also show that methylation-based enrichment for eukaryotic DNA in saliva and buccal samples increased alignment rates but also reduced read-depth uniformity, hampering CNV detection. CONCLUSION: For WGS, we recommend using DNA extracted from blood rather than saliva or buccal swabs; if saliva or buccal samples are used, we recommend against using methylation-based eukaryotic DNA enrichment. All data used in this study are available for further open-science investigation.
format Online
Article
Text
id pubmed-6929712
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-6929712 2020-01-06 Impact of DNA source on genetic variant detection from human whole-genome sequencing data Trost, Brett Walker, Susan Haider, Syed A Sung, Wilson W L Pereira, Sergio Phillips, Charly L Higginbotham, Edward J Strug, Lisa J Nguyen, Charlotte Raajkumar, Akshaya Szego, Michael J Marshall, Christian R Scherer, Stephen W J Med Genet Methods BACKGROUND: Whole blood is currently the most common DNA source for whole-genome sequencing (WGS), but for studies requiring non-invasive collection, self-collection, greater sample stability or additional tissue references, saliva or buccal samples may be preferred. However, the relative quality of sequencing data and accuracy of genetic variant detection from blood-derived, saliva-derived and buccal-derived DNA need to be thoroughly investigated. METHODS: Matched blood, saliva and buccal samples from four unrelated individuals were used to compare sequencing metrics and variant-detection accuracy among these DNA sources. RESULTS: We observed significant differences among DNA sources for sequencing quality metrics such as percentage of reads aligned and mean read depth (p<0.05). Differences were negligible in the accuracy of detecting short insertions and deletions; however, the false positive rate for single nucleotide variation detection was slightly higher in some saliva and buccal samples. The sensitivity of copy number variant (CNV) detection was up to 25% higher in blood samples, depending on CNV size and type, and appeared to be worse in saliva and buccal samples with high bacterial concentration. We also show that methylation-based enrichment for eukaryotic DNA in saliva and buccal samples increased alignment rates but also reduced read-depth uniformity, hampering CNV detection. CONCLUSION: For WGS, we recommend using DNA extracted from blood rather than saliva or buccal swabs; if saliva or buccal samples are used, we recommend against using methylation-based eukaryotic DNA enrichment. All data used in this study are available for further open-science investigation. BMJ Publishing Group 2019-12 2019-09-12 /pmc/articles/PMC6929712/ /pubmed/31515274 http://dx.doi.org/10.1136/jmedgenet-2019-106281 Text en © Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
title Impact of DNA source on genetic variant detection from human whole-genome sequencing data
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929712/
https://www.ncbi.nlm.nih.gov/pubmed/31515274
http://dx.doi.org/10.1136/jmedgenet-2019-106281