VAAST 2.0: Improved Variant Classification and Disease-Gene Identification Using a Conservation-Controlled Amino Acid Substitution Matrix

The need for improved algorithmic support for variant prioritization and disease-gene identification in personal genomes data is widely acknowledged. We previously presented the Variant Annotation, Analysis, and Search Tool (VAAST), which employs an aggregative variant association test that combines...

Bibliographic Details
Main authors: Hu, Hao; Huff, Chad D; Moore, Barry; Flygare, Steven; Reese, Martin G; Yandell, Mark
Format: Online Article Text
Language: English
Published: Blackwell Publishing Ltd 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3791556/
https://www.ncbi.nlm.nih.gov/pubmed/23836555
http://dx.doi.org/10.1002/gepi.21743
author Hu, Hao
Huff, Chad D
Moore, Barry
Flygare, Steven
Reese, Martin G
Yandell, Mark
collection PubMed
description The need for improved algorithmic support for variant prioritization and disease-gene identification in personal genomes data is widely acknowledged. We previously presented the Variant Annotation, Analysis, and Search Tool (VAAST), which employs an aggregative variant association test that combines both amino acid substitution (AAS) and allele frequencies. Here we describe and benchmark VAAST 2.0, which uses a novel conservation-controlled AAS matrix (CASM), to incorporate information about phylogenetic conservation. We show that the CASM approach improves VAAST’s variant prioritization accuracy compared to its previous implementation, and compared to SIFT, PolyPhen-2, and MutationTaster. We also show that VAAST 2.0 outperforms KBAC, WSS, SKAT, and variable threshold (VT) using published case-control datasets for Crohn disease (NOD2), hypertriglyceridemia (LPL), and breast cancer (CHEK2). VAAST 2.0 also improves search accuracy on simulated datasets across a wide range of allele frequencies, population-attributable disease risks, and allelic heterogeneity, factors that compromise the accuracies of other aggregative variant association tests. We also demonstrate that, although most aggregative variant association tests are designed for common genetic diseases, these tests can be easily adopted as rare Mendelian disease-gene finders with a simple ranking-by-statistical-significance protocol, and the performance compares very favorably to state-of-the-art filtering approaches. The latter, despite their popularity, perform suboptimally, especially as the case sample size increases.
format Online
Article
Text
id pubmed-3791556
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Blackwell Publishing Ltd
record_format MEDLINE/PubMed
spelling pubmed-3791556 2013-10-08 VAAST 2.0: Improved Variant Classification and Disease-Gene Identification Using a Conservation-Controlled Amino Acid Substitution Matrix Hu, Hao; Huff, Chad D; Moore, Barry; Flygare, Steven; Reese, Martin G; Yandell, Mark Genet Epidemiol Research Articles
Blackwell Publishing Ltd 2013-09 2013-07-08 /pmc/articles/PMC3791556/ /pubmed/23836555 http://dx.doi.org/10.1002/gepi.21743 Text en © 2013 Wiley Periodicals, Inc. http://creativecommons.org/licenses/by/2.5/ Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.
title VAAST 2.0: Improved Variant Classification and Disease-Gene Identification Using a Conservation-Controlled Amino Acid Substitution Matrix
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3791556/
https://www.ncbi.nlm.nih.gov/pubmed/23836555
http://dx.doi.org/10.1002/gepi.21743