
A general framework for estimating the relative pathogenicity of human genetic variants

Our capacity to sequence human genomes has exceeded our ability to interpret genetic variation. Current genomic annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. to missense changes). Here, we describe Combined Annotation Dependent Depleti...


Bibliographic Details
Main authors: Kircher, Martin, Witten, Daniela M., Jain, Preti, O’Roak, Brian J., Cooper, Gregory M., Shendure, Jay
Format: Online Article Text
Language: English
Published: 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3992975/
https://www.ncbi.nlm.nih.gov/pubmed/24487276
http://dx.doi.org/10.1038/ng.2892
author Kircher, Martin
Witten, Daniela M.
Jain, Preti
O’Roak, Brian J.
Cooper, Gregory M.
Shendure, Jay
collection PubMed
description Our capacity to sequence human genomes has exceeded our ability to interpret genetic variation. Current genomic annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. to missense changes). Here, we describe Combined Annotation Dependent Depletion (CADD), a framework that objectively integrates many diverse annotations into a single, quantitative score. We implement CADD as a support vector machine trained to differentiate 14.7 million high-frequency human derived alleles from 14.7 million simulated variants. We pre-compute “C-scores” for all 8.6 billion possible human single nucleotide variants and enable scoring of short insertions/deletions. C-scores correlate with allelic diversity, annotations of functionality, pathogenicity, disease severity, experimentally measured regulatory effects, and complex trait associations, and highly rank known pathogenic variants within individual genomes. The ability of CADD to prioritize functional, deleterious, and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current annotation.
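The training setup described above (a support vector machine separating observed derived alleles from simulated variants) can be illustrated with a toy sketch. This is not the authors' implementation: the two features, the Gaussian synthetic data, the linear kernel, the plain stochastic subgradient training loop, and all names (`make_variants`, `train_linear_svm`, `decision`) are assumptions made for illustration only. The actual model integrates many annotations over 14.7 million variants per class.

```python
import random

def make_variants(n, shift, rng):
    # Hypothetical data: two made-up annotation features per variant,
    # drawn from a Gaussian centred on `shift`.
    return [[rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)] for _ in range(n)]

def decision(w, b, x):
    # Linear decision value; a higher value means "more like a simulated variant".
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=20, eta=0.01, lam=0.01, seed=0):
    # Stochastic subgradient descent on the L2-regularised hinge loss.
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision(w, b, X[i]) < 1
            # Shrink the weights (regularisation term).
            w = [wj * (1 - eta * lam) for wj in w]
            if violated:
                # Hinge-loss subgradient step for margin violations.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

rng = random.Random(42)
# Label -1: stand-in for observed high-frequency derived alleles.
# Label +1: stand-in for simulated variants.
X_train = make_variants(200, -1.5, rng) + make_variants(200, 1.5, rng)
y_train = [-1] * 200 + [1] * 200
X_test = make_variants(200, -1.5, rng) + make_variants(200, 1.5, rng)
y_test = [-1] * 200 + [1] * 200

w, b = train_linear_svm(X_train, y_train)
scores = [decision(w, b, x) for x in X_test]
accuracy = sum((s > 0) == (label > 0) for s, label in zip(scores, y_test)) / len(y_test)
print(f"weights={w}, bias={b:.3f}, held-out accuracy={accuracy:.3f}")
```

In this sketch the raw decision value plays the role of a score: variants that look more like the simulated class receive higher values. How such values are turned into the published C-scores is not stated in the abstract, so the sketch does not attempt it.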
format Online
Article
Text
id pubmed-3992975
institution National Center for Biotechnology Information
language English
publishDate 2014
record_format MEDLINE/PubMed
spelling pubmed-3992975 2014-09-01 Nat Genet Article 2014-02-02 2014-03 /pmc/articles/PMC3992975/ /pubmed/24487276 http://dx.doi.org/10.1038/ng.2892 Text en Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title A general framework for estimating the relative pathogenicity of human genetic variants
topic Article