
Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation

Grouping variants based on gene mapping can augment the power of rare variant association tests. Weighting or sorting variants based on their expected functional impact can provide additional benefit. We defined groups of prioritized variants based on systematic annotation of Genetic Analysis Worksh...


Bibliographic Details
Main Authors: Nalpathamkalam, Thomas, Derkach, Andriy, Paterson, Andrew D, Merico, Daniele
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4143669/
https://www.ncbi.nlm.nih.gov/pubmed/25519362
http://dx.doi.org/10.1186/1753-6561-8-S1-S11
author Nalpathamkalam, Thomas
Derkach, Andriy
Paterson, Andrew D
Merico, Daniele
collection PubMed
description Grouping variants based on gene mapping can augment the power of rare variant association tests. Weighting or sorting variants based on their expected functional impact can provide additional benefit. We defined groups of prioritized variants based on systematic annotation of Genetic Analysis Workshop 18 (GAW18) single-nucleotide variants; we focused on variants detected by whole genome sequencing, specifically on the high-quality subset presented in the genotype files. First, we divided variants into coding and noncoding. Coding variants are fewer than 1% of the total and are more likely to have a biological effect than noncoding variants. Coding variants were further stratified into protein-changing and protein-damaging groups based on their effect on the protein amino acid sequence. In particular, missense variants predicted to be damaging, splice-site alterations, and stop gains were assigned to the protein-damaging category. The impact of noncoding variants is more difficult to predict. We decided to rely solely on conservation: we combined (a) the mammalian phastCons Conserved Element and (b) the PhyloP score, which identify conserved intervals and conserved single-nucleotide positions, respectively. This reduced the noncoding variants to a number comparable to that of coding variants. Finally, using gene structure definitions from the widely used RefSeq database, we mapped variants to genes to support association tests that require collapsing rare variants to genes. Companion GAW18 papers used these variant priority groups and gene mapping; one of these papers specifically found evidence of a stronger association signal for protein-damaging variants.
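The stratification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the effect labels, the `Variant` fields, the group names, the `PHYLOP_CUTOFF` value, and the choice to require both conservation criteria at once are all hypothetical, since the abstract names neither the annotation vocabulary nor any thresholds nor how the two conservation measures were combined.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical effect labels; the abstract does not name the annotation vocabulary.
DAMAGING_EFFECTS = {"splice_site", "stop_gain"}
PROTEIN_CHANGING_EFFECTS = DAMAGING_EFFECTS | {"missense"}

# Hypothetical cutoff; the abstract does not state a PhyloP threshold.
PHYLOP_CUTOFF = 2.0


@dataclass
class Variant:
    effect: Optional[str]               # coding effect label, None if noncoding
    is_coding: bool
    predicted_damaging: bool = False    # missense damage prediction
    in_phastcons_element: bool = False  # falls inside a conserved element
    phylop: float = 0.0                 # per-position conservation score


def priority_group(v: Variant) -> str:
    """Return the most specific priority group for one variant."""
    if v.is_coding:
        damaging_missense = v.effect == "missense" and v.predicted_damaging
        if v.effect in DAMAGING_EFFECTS or damaging_missense:
            return "protein_damaging"
        if v.effect in PROTEIN_CHANGING_EFFECTS:
            return "protein_changing"
        return "coding_other"
    # Assumption: a noncoding variant must satisfy both conservation criteria.
    if v.in_phastcons_element and v.phylop >= PHYLOP_CUTOFF:
        return "noncoding_conserved"
    return "noncoding_other"
```

Returning only the most specific group keeps the function simple; a gene-based collapsing test that treats protein-damaging variants as a subset of protein-changing ones would pool both labels when building the broader group.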
format Online
Article
Text
id pubmed-4143669
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4143669 2014-09-02 Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation Nalpathamkalam, Thomas Derkach, Andriy Paterson, Andrew D Merico, Daniele BMC Proc Proceedings
BioMed Central 2014-06-17 /pmc/articles/PMC4143669/ /pubmed/25519362 http://dx.doi.org/10.1186/1753-6561-8-S1-S11 Text en Copyright © 2014 Nalpathamkalam et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation
topic Proceedings