
CLIMB: High-dimensional association detection in large scale genomic data

Joint analyses of genomic datasets obtained in multiple different conditions are essential for understanding the biological mechanism that drives tissue-specificity and cell differentiation, but they still remain computationally challenging. To address this we introduce CLIMB (Composite LIkelihood e...

Bibliographic Details
Main authors: Koch, Hillary, Keller, Cheryl A., Xiang, Guanjue, Giardine, Belinda, Zhang, Feipeng, Wang, Yicheng, Hardison, Ross C., Li, Qunhua
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9653391/
https://www.ncbi.nlm.nih.gov/pubmed/36371401
http://dx.doi.org/10.1038/s41467-022-34360-z
author Koch, Hillary
Keller, Cheryl A.
Xiang, Guanjue
Giardine, Belinda
Zhang, Feipeng
Wang, Yicheng
Hardison, Ross C.
Li, Qunhua
author_sort Koch, Hillary
collection PubMed
description Joint analyses of genomic datasets obtained in multiple different conditions are essential for understanding the biological mechanism that drives tissue-specificity and cell differentiation, but they still remain computationally challenging. To address this we introduce CLIMB (Composite LIkelihood eMpirical Bayes), a statistical methodology that learns patterns of condition-specificity present in genomic data. CLIMB provides a generic framework facilitating a host of analyses, such as clustering genomic features sharing similar condition-specific patterns and identifying which of these features are involved in cell fate commitment. We apply CLIMB to three sets of hematopoietic data, which examine CTCF ChIP-seq measured in 17 different cell populations, RNA-seq measured across constituent cell populations in three committed lineages, and DNase-seq in 38 cell populations. Our results show that CLIMB improves upon existing alternatives in statistical precision, while capturing interpretable and biologically relevant clusters in the data.
format Online
Article
Text
id pubmed-9653391
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9653391 2022-11-15 CLIMB: High-dimensional association detection in large scale genomic data Koch, Hillary Keller, Cheryl A. Xiang, Guanjue Giardine, Belinda Zhang, Feipeng Wang, Yicheng Hardison, Ross C. Li, Qunhua Nat Commun Article Joint analyses of genomic datasets obtained in multiple different conditions are essential for understanding the biological mechanism that drives tissue-specificity and cell differentiation, but they still remain computationally challenging. To address this we introduce CLIMB (Composite LIkelihood eMpirical Bayes), a statistical methodology that learns patterns of condition-specificity present in genomic data. CLIMB provides a generic framework facilitating a host of analyses, such as clustering genomic features sharing similar condition-specific patterns and identifying which of these features are involved in cell fate commitment. We apply CLIMB to three sets of hematopoietic data, which examine CTCF ChIP-seq measured in 17 different cell populations, RNA-seq measured across constituent cell populations in three committed lineages, and DNase-seq in 38 cell populations. Our results show that CLIMB improves upon existing alternatives in statistical precision, while capturing interpretable and biologically relevant clusters in the data. Nature Publishing Group UK 2022-11-12 /pmc/articles/PMC9653391/ /pubmed/36371401 http://dx.doi.org/10.1038/s41467-022-34360-z Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title CLIMB: High-dimensional association detection in large scale genomic data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9653391/
https://www.ncbi.nlm.nih.gov/pubmed/36371401
http://dx.doi.org/10.1038/s41467-022-34360-z