
A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data

Any single nucleotide variant detection study could benefit from a fast and cheap method of measuring the quality of a variant call list. It is advantageous to be able to see how the call list quality is affected by different variant filtering thresholds and other adjustments to the study parameters....


Bibliographic Details
Main Author: Tuzov, Nik
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5918994/
https://www.ncbi.nlm.nih.gov/pubmed/29694377
http://dx.doi.org/10.1371/journal.pone.0196058
_version_ 1783317538927542272
author Tuzov, Nik
author_facet Tuzov, Nik
author_sort Tuzov, Nik
collection PubMed
description Any single nucleotide variant detection study could benefit from a fast and cheap method of measuring the quality of a variant call list. It is advantageous to be able to see how the call list quality is affected by different variant filtering thresholds and other adjustments to the study parameters. Here we examine the possibility of estimating the proportion of true positives in a single nucleotide variant call list for human data. Using whole-exome and whole-genome gold standard data sets for training, we focus on building a generic model that relies only on information available from any variant caller. We assess and compare the performance of different candidate models based on their practical accuracy. We find that the generic model delivers decent accuracy most of the time. Further, we conclude that its performance could be improved substantially by leveraging the variant quality metrics that are specific to each variant calling tool.
format Online
Article
Text
id pubmed-5918994
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5918994 2018-05-05 A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data Tuzov, Nik PLoS One Research Article Any single nucleotide variant detection study could benefit from a fast and cheap method of measuring the quality of a variant call list. It is advantageous to be able to see how the call list quality is affected by different variant filtering thresholds and other adjustments to the study parameters. Here we examine the possibility of estimating the proportion of true positives in a single nucleotide variant call list for human data. Using whole-exome and whole-genome gold standard data sets for training, we focus on building a generic model that relies only on information available from any variant caller. We assess and compare the performance of different candidate models based on their practical accuracy. We find that the generic model delivers decent accuracy most of the time. Further, we conclude that its performance could be improved substantially by leveraging the variant quality metrics that are specific to each variant calling tool. Public Library of Science 2018-04-25 /pmc/articles/PMC5918994/ /pubmed/29694377 http://dx.doi.org/10.1371/journal.pone.0196058 Text en © 2018 Nik Tuzov http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Tuzov, Nik
A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data
title A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data
title_full A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data
title_fullStr A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data
title_full_unstemmed A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data
title_short A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data
title_sort framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5918994/
https://www.ncbi.nlm.nih.gov/pubmed/29694377
http://dx.doi.org/10.1371/journal.pone.0196058
work_keys_str_mv AT tuzovnik aframeworkfortheestimationoftheproportionoftruediscoveriesinsinglenucleotidevariantdetectionstudiesforhumandata
AT tuzovnik frameworkfortheestimationoftheproportionoftruediscoveriesinsinglenucleotidevariantdetectionstudiesforhumandata