
The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool

BACKGROUND: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public...


Bibliographic Details
Main Authors: Flygare, Steven, Hernandez, Edgar Javier, Phan, Lon, Moore, Barry, Li, Man, Fejes, Anthony, Hu, Hao, Eilbeck, Karen, Huff, Chad, Jorde, Lynn, G. Reese, Martin, Yandell, Mark
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5819680/
https://www.ncbi.nlm.nih.gov/pubmed/29463208
http://dx.doi.org/10.1186/s12859-018-2056-y
_version_ 1783301250408775680
author Flygare, Steven
Hernandez, Edgar Javier
Phan, Lon
Moore, Barry
Li, Man
Fejes, Anthony
Hu, Hao
Eilbeck, Karen
Huff, Chad
Jorde, Lynn
G. Reese, Martin
Yandell, Mark
author_facet Flygare, Steven
Hernandez, Edgar Javier
Phan, Lon
Moore, Barry
Li, Man
Fejes, Anthony
Hu, Hao
Eilbeck, Karen
Huff, Chad
Jorde, Lynn
G. Reese, Martin
Yandell, Mark
author_sort Flygare, Steven
collection PubMed
description BACKGROUND: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. VVP is ultrafast, scales to even the largest variant repositories and genome collections, and its outputs are designed to simplify clinical interpretation of variants of uncertain significance. RESULTS: We show that scoring the entire contents of dbSNP (> 155 million variants) requires only 95 min using a machine with 4 cpus and 16 GB of RAM, and that a 60X WGS can be processed in less than 5 min. We also demonstrate that VVP can score variants anywhere in the genome, regardless of type, effect, or location. It does so by integrating sequence conservation, the type of sequence change, allele frequencies, variant burden, and zygosity. Finally, we also show that VVP scores are consistently accurate, and easily interpreted, traits not shared by many commonly used tools such as SIFT and CADD. CONCLUSIONS: VVP provides rapid and scalable means to prioritize any sequence variant, anywhere in the genome, and its scores are designed to facilitate variant interpretation using ACMG and NHS guidelines. These traits make it well suited for operation on very large collections of WGS sequences. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2056-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5819680
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-58196802018-02-26 The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool Flygare, Steven Hernandez, Edgar Javier Phan, Lon Moore, Barry Li, Man Fejes, Anthony Hu, Hao Eilbeck, Karen Huff, Chad Jorde, Lynn G. Reese, Martin Yandell, Mark BMC Bioinformatics Software BACKGROUND: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. VVP is ultrafast, scales to even the largest variant repositories and genome collections, and its outputs are designed to simplify clinical interpretation of variants of uncertain significance. RESULTS: We show that scoring the entire contents of dbSNP (> 155 million variants) requires only 95 min using a machine with 4 cpus and 16 GB of RAM, and that a 60X WGS can be processed in less than 5 min. We also demonstrate that VVP can score variants anywhere in the genome, regardless of type, effect, or location. It does so by integrating sequence conservation, the type of sequence change, allele frequencies, variant burden, and zygosity. Finally, we also show that VVP scores are consistently accurate, and easily interpreted, traits not shared by many commonly used tools such as SIFT and CADD. CONCLUSIONS: VVP provides rapid and scalable means to prioritize any sequence variant, anywhere in the genome, and its scores are designed to facilitate variant interpretation using ACMG and NHS guidelines. These traits make it well suited for operation on very large collections of WGS sequences. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2056-y) contains supplementary material, which is available to authorized users. BioMed Central 2018-02-20 /pmc/articles/PMC5819680/ /pubmed/29463208 http://dx.doi.org/10.1186/s12859-018-2056-y Text en © The Author(s). 2018 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Flygare, Steven
Hernandez, Edgar Javier
Phan, Lon
Moore, Barry
Li, Man
Fejes, Anthony
Hu, Hao
Eilbeck, Karen
Huff, Chad
Jorde, Lynn
G. Reese, Martin
Yandell, Mark
The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool
title The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool
title_full The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool
title_fullStr The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool
title_full_unstemmed The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool
title_short The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool
title_sort vaast variant prioritizer (vvp): ultrafast, easy to use whole genome variant prioritization tool
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5819680/
https://www.ncbi.nlm.nih.gov/pubmed/29463208
http://dx.doi.org/10.1186/s12859-018-2056-y