Systematic discovery of conservation states for single-nucleotide annotation of the human genome
Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, Con...
Main authors: | Arneson, Adriana; Ernst, Jason |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6606595/ https://www.ncbi.nlm.nih.gov/pubmed/31286065 http://dx.doi.org/10.1038/s42003-019-0488-1 |
_version_ | 1783431925288927232 |
---|---|
author | Arneson, Adriana; Ernst, Jason |
author_facet | Arneson, Adriana; Ernst, Jason |
author_sort | Arneson, Adriana |
collection | PubMed |
description | Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, ConsHMM, which applies a multivariate hidden Markov model to learn de novo ‘conservation states’ based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multiple species DNA sequence alignment. We applied ConsHMM to a 100-way vertebrate sequence alignment to annotate the human genome at single nucleotide resolution into 100 conservation states. These states have distinct enrichments for other genomic information including gene annotations, chromatin states, repeat families, and bases prioritized by various variant prioritization scores. Constrained elements have distinct heritability partitioning enrichments depending on their conservation state assignment. ConsHMM conservation states are a resource for analyzing genomes and genetic variants. |
format | Online Article Text |
id | pubmed-6606595 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6606595 2019-07-08 Arneson, Adriana; Ernst, Jason Commun Biol Article Nature Publishing Group UK 2019-07-02 /pmc/articles/PMC6606595/ /pubmed/31286065 http://dx.doi.org/10.1038/s42003-019-0488-1 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Arneson, Adriana Ernst, Jason Systematic discovery of conservation states for single-nucleotide annotation of the human genome |
title | Systematic discovery of conservation states for single-nucleotide annotation of the human genome |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6606595/ https://www.ncbi.nlm.nih.gov/pubmed/31286065 http://dx.doi.org/10.1038/s42003-019-0488-1 |