
Reducing Sanger confirmation testing through false positive prediction algorithms

PURPOSE: Clinical genome sequencing (cGS) followed by orthogonal confirmatory testing is standard practice. While orthogonal testing significantly improves specificity, it also results in increased turnaround time and cost of testing. The purpose of this study is to evaluate machine learning models...


Bibliographic Details
Main Authors: Holt, James M., Kelly, Melissa, Sundlof, Brett, Nakouzi, Ghunwa, Bick, David, Lyon, Elaine
Format: Online Article Text
Language: English
Published: Nature Publishing Group US 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8257489/
https://www.ncbi.nlm.nih.gov/pubmed/33767343
http://dx.doi.org/10.1038/s41436-021-01148-3
author Holt, James M.
Kelly, Melissa
Sundlof, Brett
Nakouzi, Ghunwa
Bick, David
Lyon, Elaine
collection PubMed
description PURPOSE: Clinical genome sequencing (cGS) followed by orthogonal confirmatory testing is standard practice. While orthogonal testing significantly improves specificity, it also results in increased turnaround time and cost of testing. The purpose of this study is to evaluate machine learning models trained to identify false positive variants in cGS data to reduce the need for orthogonal testing. METHODS: We sequenced five reference human genome samples characterized by the Genome in a Bottle Consortium (GIAB) and compared the results with an established set of variants for each genome referred to as a truth set. We then trained machine learning models to identify variants that were labeled as false positives. RESULTS: After training, the models identified 99.5% of the false positive heterozygous single-nucleotide variants (SNVs) and heterozygous insertions/deletions variants (indels) while reducing confirmatory testing of nonactionable, nonprimary SNVs by 85% and indels by 75%. Employing the algorithm in clinical practice reduced overall orthogonal testing using dideoxynucleotide (Sanger) sequencing by 71%. CONCLUSION: Our results indicate that a low false positive call rate can be maintained while significantly reducing the need for confirmatory testing. The framework that generated our models and results is publicly available at https://github.com/HudsonAlpha/STEVE.
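The workflow in the abstract (label calls against a truth set, then flag likely false positives for confirmation) can be illustrated with a minimal sketch. This is not the STEVE implementation: the `Call` type, the `fp_score` field (standing in for a trained model's output), the function names, and the toy variants are all hypothetical, and a single score threshold replaces the actual machine learning models.

```python
from math import ceil
from typing import NamedTuple


class Call(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    fp_score: float  # hypothetical model output: higher means more likely a false positive


def label_calls(calls, truth_set):
    """Pair each call with True if it is absent from the truth set (a false positive)."""
    return [(c, (c.chrom, c.pos, c.ref, c.alt) not in truth_set) for c in calls]


def threshold_for_capture(labeled, target_capture):
    """Highest score threshold that still flags at least target_capture of the false positives."""
    fp_scores = sorted((c.fp_score for c, is_fp in labeled if is_fp), reverse=True)
    if not fp_scores:
        raise ValueError("no false positives available for calibration")
    needed = min(len(fp_scores), max(1, ceil(target_capture * len(fp_scores))))
    return fp_scores[needed - 1]


def summarize(labeled, threshold):
    """Report the false positive capture rate and the share of calls that skip confirmation."""
    flagged = [(c, is_fp) for c, is_fp in labeled if c.fp_score >= threshold]
    n_fp = sum(is_fp for _, is_fp in labeled)  # assumed > 0 after calibration
    captured = sum(is_fp for _, is_fp in flagged)
    return {
        "fp_capture_rate": captured / n_fp,
        "confirmation_reduction": 1 - len(flagged) / len(labeled),
    }


# Toy data (invented for illustration only).
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "GA"),
    ("chr2", 75, "T", "C"),
}
calls = [
    Call("chr1", 100, "A", "G", 0.02),
    Call("chr1", 200, "C", "T", 0.10),
    Call("chr2", 50, "G", "GA", 0.40),
    Call("chr2", 75, "T", "C", 0.05),
    Call("chr1", 300, "G", "A", 0.90),  # not in truth set
    Call("chr2", 90, "AT", "A", 0.35),  # not in truth set
]

labeled = label_calls(calls, truth)
threshold = threshold_for_capture(labeled, target_capture=1.0)
stats = summarize(labeled, threshold)
print(threshold, stats)
```

Choosing the threshold from a target capture rate mirrors the trade-off the abstract reports: a higher capture target keeps more false positives in the confirmation queue but sends more true calls to orthogonal testing as well.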
format Online
Article
Text
id pubmed-8257489
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group US
record_format MEDLINE/PubMed
spelling pubmed-8257489 2021-07-23 Reducing Sanger confirmation testing through false positive prediction algorithms Holt, James M. Kelly, Melissa Sundlof, Brett Nakouzi, Ghunwa Bick, David Lyon, Elaine Genet Med Article
Nature Publishing Group US 2021-03-25 2021 /pmc/articles/PMC8257489/ /pubmed/33767343 http://dx.doi.org/10.1038/s41436-021-01148-3 Text en © The Author(s) 2021. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and the source, a link to the license is provided, and any changes are indicated. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/.
title Reducing Sanger confirmation testing through false positive prediction algorithms
topic Article