
SNPs2ChIP: Latent Factors of ChIP-seq to infer functions of non-coding SNPs


Bibliographic Details
Main Authors: Anand, Shankara; Kalesinskas, Laurynas; Smail, Craig; Tanigawa, Yosuke
Format: Online Article Text
Language: English
Published: 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6417821/
https://www.ncbi.nlm.nih.gov/pubmed/30864321
collection PubMed
description Genetic variations of the human genome are linked to many disease phenotypes. While whole-genome sequencing and genome-wide association studies (GWAS) have uncovered a number of genotype-phenotype associations, their functional interpretation remains challenging given that most single nucleotide polymorphisms (SNPs) fall in non-coding regions of the genome. Advances in chromatin immunoprecipitation sequencing (ChIP-seq) have made large-scale repositories of epigenetic data available, allowing investigation of coordinated mechanisms of epigenetic markers and transcriptional regulation and their influence on biological function. To address this, we propose SNPs2ChIP, a method to infer biological functions of non-coding variants through unsupervised statistical learning methods applied to publicly available epigenetic datasets. We systematically characterized latent factors by applying singular value decomposition to ChIP-seq tracks of lymphoblastoid cell lines, and annotated the biological function of each latent factor using the genomic region enrichment analysis tool. Using these annotated latent factors as reference, we developed SNPs2ChIP, a pipeline that takes genomic region(s) as input, identifies the relevant latent factors with quantitative scores, and returns them along with their inferred functions. As a case study, we focused on systemic lupus erythematosus and demonstrated our method’s ability to infer relevant biological function. We systematically applied SNPs2ChIP to publicly available datasets, including known GWAS associations from the GWAS catalogue and ChIP-seq peaks from a previously published study. Our approach of leveraging latent patterns across genome-wide epigenetic datasets to infer biological function will advance understanding of the genetics of human diseases by accelerating the interpretation of non-coding genomes.
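The decomposition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the matrix here is synthetic, the function name `latent_factor_scores`, the number of factors `k`, and the squared-cosine scoring are all assumptions chosen for illustration, since the record does not specify how the quantitative scores are computed.

```python
import numpy as np


def latent_factor_scores(X, k):
    """Truncated SVD of X (genomic regions x ChIP-seq tracks).

    Returns per-region scores over the top-k latent factors and the
    track loadings (rows of Vt). The squared-cosine score used here is
    an illustrative assumption, not a detail taken from the record.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, Vt = U[:, :k], s[:k], Vt[:k, :]
    coords = U * s                      # region coordinates in latent space
    sq = coords ** 2
    norm = sq.sum(axis=1, keepdims=True)
    # Fraction of each region's squared length carried by each factor;
    # rows with zero length get all-zero scores instead of NaN.
    scores = np.divide(sq, norm, out=np.zeros_like(sq), where=norm > 0)
    return scores, Vt


# Synthetic stand-in for a signal matrix: 200 regions x 12 tracks.
rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(200, 12)).astype(float)
X -= X.mean(axis=0)                     # center each track

scores, Vt = latent_factor_scores(X, k=5)

# Rank the latent factors for one hypothetical query region (row index 17).
query = 17
top = np.argsort(scores[query])[::-1][:3]
print("top latent factors for region", query, ":", top)
```

In a setting like the one the abstract describes, the rows of `Vt` would indicate which ChIP-seq tracks drive each latent factor, and the top-scoring factors for a query region would then be looked up against their functional annotations.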
id pubmed-6417821
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6417821 2019-03-14 Pac Symp Biocomput Article 2019 /pmc/articles/PMC6417821/ /pubmed/30864321 Text en Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC) 4.0 License. http://creativecommons.org/licenses/by/4.0/