Systematic discovery of conservation states for single-nucleotide annotation of the human genome

Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, Con...

Bibliographic Details
Main authors: Arneson, Adriana; Ernst, Jason
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6606595/
https://www.ncbi.nlm.nih.gov/pubmed/31286065
http://dx.doi.org/10.1038/s42003-019-0488-1
author Arneson, Adriana
Ernst, Jason
collection PubMed
description Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, ConsHMM, which applies a multivariate hidden Markov model to learn de novo ‘conservation states’ based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multiple species DNA sequence alignment. We applied ConsHMM to a 100-way vertebrate sequence alignment to annotate the human genome at single nucleotide resolution into 100 conservation states. These states have distinct enrichments for other genomic information including gene annotations, chromatin states, repeat families, and bases prioritized by various variant prioritization scores. Constrained elements have distinct heritability partitioning enrichments depending on their conservation state assignment. ConsHMM conservation states are a resource for analyzing genomes and genetic variants.
format Online
Article
Text
id pubmed-6606595
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6606595 2019-07-08
Systematic discovery of conservation states for single-nucleotide annotation of the human genome
Arneson, Adriana; Ernst, Jason
Commun Biol
Article
Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, ConsHMM, which applies a multivariate hidden Markov model to learn de novo ‘conservation states’ based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multiple species DNA sequence alignment. We applied ConsHMM to a 100-way vertebrate sequence alignment to annotate the human genome at single nucleotide resolution into 100 conservation states. These states have distinct enrichments for other genomic information including gene annotations, chromatin states, repeat families, and bases prioritized by various variant prioritization scores. Constrained elements have distinct heritability partitioning enrichments depending on their conservation state assignment. ConsHMM conservation states are a resource for analyzing genomes and genetic variants.
Nature Publishing Group UK 2019-07-02
/pmc/articles/PMC6606595/ /pubmed/31286065 http://dx.doi.org/10.1038/s42003-019-0488-1
Text en
© The Author(s) 2019. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Systematic discovery of conservation states for single-nucleotide annotation of the human genome
topic Article