Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation
Grouping variants based on gene mapping can augment the power of rare variant association tests. Weighting or sorting variants based on their expected functional impact can provide additional benefit. We defined groups of prioritized variants based on systematic annotation of Genetic Analysis Worksh...
Main authors: | Nalpathamkalam, Thomas; Derkach, Andriy; Paterson, Andrew D; Merico, Daniele |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4143669/ https://www.ncbi.nlm.nih.gov/pubmed/25519362 http://dx.doi.org/10.1186/1753-6561-8-S1-S11 |
_version_ | 1782331934892883968 |
---|---|
author | Nalpathamkalam, Thomas; Derkach, Andriy; Paterson, Andrew D; Merico, Daniele |
author_facet | Nalpathamkalam, Thomas; Derkach, Andriy; Paterson, Andrew D; Merico, Daniele |
author_sort | Nalpathamkalam, Thomas |
collection | PubMed |
description | Grouping variants based on gene mapping can augment the power of rare variant association tests. Weighting or sorting variants based on their expected functional impact can provide additional benefit. We defined groups of prioritized variants based on systematic annotation of Genetic Analysis Workshop 18 (GAW18) single-nucleotide variants; we focused on variants detected by whole genome sequencing, specifically on the high-quality subset presented in the genotype files. First, we divided variants into coding and noncoding. Coding variants make up less than 1% of the total and are more likely to have a biological effect than noncoding variants. Coding variants were further stratified into protein changing and protein damaging groups based on their effect on the protein amino acid sequence. In particular, missense variants predicted to be damaging, splice-site alterations, and stop gains were assigned to the protein damaging category. The impact of noncoding variants is more difficult to predict. We decided to rely solely on conservation: we combined (a) the mammalian phastCons Conserved Element and (b) the PhyloP score, which identify conserved intervals and conserved single-nucleotide positions, respectively. This reduced the noncoding variants to a number comparable to that of coding variants. Finally, using gene structure definitions from the widely used RefSeq database, we mapped variants to genes to support association tests that require collapsing rare variants to genes. Companion GAW18 papers used these variant priority groups and gene mapping; one of these papers specifically found evidence of a stronger association signal for protein damaging variants. |
format | Online Article Text |
id | pubmed-4143669 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4143669 2014-09-02 Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation Nalpathamkalam, Thomas Derkach, Andriy Paterson, Andrew D Merico, Daniele BMC Proc Proceedings Grouping variants based on gene mapping can augment the power of rare variant association tests. Weighting or sorting variants based on their expected functional impact can provide additional benefit. We defined groups of prioritized variants based on systematic annotation of Genetic Analysis Workshop 18 (GAW18) single-nucleotide variants; we focused on variants detected by whole genome sequencing, specifically on the high-quality subset presented in the genotype files. First, we divided variants into coding and noncoding. Coding variants make up less than 1% of the total and are more likely to have a biological effect than noncoding variants. Coding variants were further stratified into protein changing and protein damaging groups based on their effect on the protein amino acid sequence. In particular, missense variants predicted to be damaging, splice-site alterations, and stop gains were assigned to the protein damaging category. The impact of noncoding variants is more difficult to predict. We decided to rely solely on conservation: we combined (a) the mammalian phastCons Conserved Element and (b) the PhyloP score, which identify conserved intervals and conserved single-nucleotide positions, respectively. This reduced the noncoding variants to a number comparable to that of coding variants. Finally, using gene structure definitions from the widely used RefSeq database, we mapped variants to genes to support association tests that require collapsing rare variants to genes. Companion GAW18 papers used these variant priority groups and gene mapping; one of these papers specifically found evidence of a stronger association signal for protein damaging variants.
BioMed Central 2014-06-17 /pmc/articles/PMC4143669/ /pubmed/25519362 http://dx.doi.org/10.1186/1753-6561-8-S1-S11 Text en Copyright © 2014 Nalpathamkalam et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Proceedings Nalpathamkalam, Thomas Derkach, Andriy Paterson, Andrew D Merico, Daniele Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation |
title | Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation |
title_full | Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation |
title_fullStr | Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation |
title_full_unstemmed | Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation |
title_short | Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation |
title_sort | genetic analysis workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4143669/ https://www.ncbi.nlm.nih.gov/pubmed/25519362 http://dx.doi.org/10.1186/1753-6561-8-S1-S11 |
work_keys_str_mv | AT nalpathamkalamthomas geneticanalysisworkshop18singlenucleotidevariantprioritizationbasedonproteinimpactsequenceconservationandgeneannotation AT derkachandriy geneticanalysisworkshop18singlenucleotidevariantprioritizationbasedonproteinimpactsequenceconservationandgeneannotation AT patersonandrewd geneticanalysisworkshop18singlenucleotidevariantprioritizationbasedonproteinimpactsequenceconservationandgeneannotation AT mericodaniele geneticanalysisworkshop18singlenucleotidevariantprioritizationbasedonproteinimpactsequenceconservationandgeneannotation |
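
The prioritization scheme summarized in the description field can be illustrated with a short sketch. This is not the authors' pipeline: the `Variant` fields, the group labels, the `PHYLOP_MIN` cutoff, and the choice to require both a phastCons element and a PhyloP score are hypothetical assumptions made for illustration, since the record does not state cutoffs or how the two conservation measures were combined. Treating the groups as mutually exclusive is likewise an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff; the record does not give the value used in the paper.
PHYLOP_MIN = 2.0

# Effects assigned to the protein damaging group regardless of other annotation.
ALWAYS_DAMAGING = {"stop_gain", "splice_site"}


@dataclass(frozen=True)
class Variant:
    """Illustrative annotation record (all field names are assumptions)."""
    variant_id: str
    gene: Optional[str]   # gene symbol from a RefSeq-style mapping, None if unmapped
    effect: str           # "missense", "synonymous", "stop_gain", "splice_site", "noncoding"
    predicted_damaging: bool = False    # output of a missense impact predictor
    in_conserved_element: bool = False  # overlaps a phastCons conserved element
    phylop: float = 0.0                 # per-position PhyloP score


def priority_group(v: Variant, phylop_min: float = PHYLOP_MIN) -> str:
    """Assign a variant to one priority group, most severe category first."""
    if v.effect in ALWAYS_DAMAGING:
        return "protein_damaging"
    if v.effect == "missense":
        return "protein_damaging" if v.predicted_damaging else "protein_changing"
    if v.effect == "synonymous":
        return "coding_other"
    # Noncoding: rely on conservation only (assumed here to need both signals).
    if v.in_conserved_element and v.phylop >= phylop_min:
        return "noncoding_conserved"
    return "noncoding_other"


def collapse_by_gene(variants):
    """Collect variant IDs per gene and priority group for gene-based tests."""
    groups = defaultdict(lambda: defaultdict(list))
    for v in variants:
        if v.gene is not None:  # unmapped variants cannot be collapsed to a gene
            groups[v.gene][priority_group(v)].append(v.variant_id)
    return {gene: dict(by_group) for gene, by_group in groups.items()}


if __name__ == "__main__":
    demo = [
        Variant("v1", "GENE_A", "stop_gain"),
        Variant("v2", "GENE_A", "missense", predicted_damaging=True),
        Variant("v3", "GENE_A", "missense"),
        Variant("v4", "GENE_B", "noncoding", in_conserved_element=True, phylop=3.1),
        Variant("v5", None, "noncoding"),
    ]
    for gene, by_group in collapse_by_gene(demo).items():
        print(gene, by_group)
```

A gene-based rare variant test would then take, for each gene, only the variant IDs in the groups of interest (for example `protein_damaging`) as its collapsing set.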