Evaluation of computational genotyping of structural variation for clinical diagnoses

BACKGROUND: Structural variation (SV) plays a pivotal role in genetic disease. The discovery of SVs based on short DNA sequence reads from next-generation DNA sequence methods is error-prone, with low sensitivity and high false discovery rates. These shortcomings can be partially overcome with exten...

Bibliographic Details
Main Authors: Chander, Varuna; Gibbs, Richard A; Sedlazeck, Fritz J
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6732172/
https://www.ncbi.nlm.nih.gov/pubmed/31494671
http://dx.doi.org/10.1093/gigascience/giz110
_version_ 1783449778568298496
author Chander, Varuna
Gibbs, Richard A
Sedlazeck, Fritz J
collection PubMed
description BACKGROUND: Structural variation (SV) plays a pivotal role in genetic disease. The discovery of SVs based on short DNA sequence reads from next-generation DNA sequence methods is error-prone, with low sensitivity and high false discovery rates. These shortcomings can be partially overcome with extensive orthogonal validation methods or use of long reads, but the current cost precludes their application for routine clinical diagnostics. In contrast, SV genotyping of known sites of SV occurrence is relatively robust and therefore offers a cost-effective clinical diagnostic tool with potentially few false-positive and false-negative results, even when applied to short-read DNA sequence data. RESULTS: We assess 5 state-of-the-art SV genotyping software methods, applied to short-read sequence data. The methods are characterized on the basis of their ability to genotype different SV types, spanning different size ranges. Furthermore, we analyze their ability to parse different VCF file subformats and assess their reliance on specific metadata. We compare the SV genotyping methods across a range of simulated and real data including SVs that were not found with Illumina data alone. We assess sensitivity and the ability to filter initial false discovery calls. We determined the impact of SV type and size on the performance for each SV genotyper. Overall, STIX performed the best on both simulated and GiaB-based SV calls, demonstrating a good balance between sensitivity and specificity. CONCLUSION: Our results indicate that, although SV genotyping software methods have superior performance to SV callers, there are limitations that suggest the need for further innovation.
format Online
Article
Text
id pubmed-6732172
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-67321722019-09-12 Evaluation of computational genotyping of structural variation for clinical diagnoses Chander, Varuna Gibbs, Richard A Sedlazeck, Fritz J Gigascience Research BACKGROUND: Structural variation (SV) plays a pivotal role in genetic disease. The discovery of SVs based on short DNA sequence reads from next-generation DNA sequence methods is error-prone, with low sensitivity and high false discovery rates. These shortcomings can be partially overcome with extensive orthogonal validation methods or use of long reads, but the current cost precludes their application for routine clinical diagnostics. In contrast, SV genotyping of known sites of SV occurrence is relatively robust and therefore offers a cost-effective clinical diagnostic tool with potentially few false-positive and false-negative results, even when applied to short-read DNA sequence data. RESULTS: We assess 5 state-of-the-art SV genotyping software methods, applied to short-read sequence data. The methods are characterized on the basis of their ability to genotype different SV types, spanning different size ranges. Furthermore, we analyze their ability to parse different VCF file subformats and assess their reliance on specific metadata. We compare the SV genotyping methods across a range of simulated and real data including SVs that were not found with Illumina data alone. We assess sensitivity and the ability to filter initial false discovery calls. We determined the impact of SV type and size on the performance for each SV genotyper. Overall, STIX performed the best on both simulated and GiaB-based SV calls, demonstrating a good balance between sensitivity and specificity. CONCLUSION: Our results indicate that, although SV genotyping software methods have superior performance to SV callers, there are limitations that suggest the need for further innovation.
Oxford University Press 2019-09-08 /pmc/articles/PMC6732172/ /pubmed/31494671 http://dx.doi.org/10.1093/gigascience/giz110 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Chander, Varuna
Gibbs, Richard A
Sedlazeck, Fritz J
Evaluation of computational genotyping of structural variation for clinical diagnoses
title Evaluation of computational genotyping of structural variation for clinical diagnoses
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6732172/
https://www.ncbi.nlm.nih.gov/pubmed/31494671
http://dx.doi.org/10.1093/gigascience/giz110