
Predicting the targets of IRF8 and NFATc1 during osteoclast differentiation using the machine learning method framework cTAP

BACKGROUND: Interferon regulatory factor-8 (IRF8) and nuclear factor of activated T cells c1 (NFATc1) are two transcription factors that play an important role in osteoclast differentiation. Thanks to ChIP-seq technology, scientists can now estimate potential genome-wide target genes of IRF8 and NFATc1...


Bibliographic Details
Main Authors:	Wang, Honglin; Joshi, Pujan; Hong, Seung-Hyun; Maye, Peter F.; Rowe, David W.; Shin, Dong-Guk
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8740472/ https://www.ncbi.nlm.nih.gov/pubmed/34991467 http://dx.doi.org/10.1186/s12864-021-08159-z

author	Wang, Honglin; Joshi, Pujan; Hong, Seung-Hyun; Maye, Peter F.; Rowe, David W.; Shin, Dong-Guk
author_sort	Wang, Honglin
collection	PubMed
description	BACKGROUND: Interferon regulatory factor-8 (IRF8) and nuclear factor of activated T cells c1 (NFATc1) are two transcription factors that play an important role in osteoclast differentiation. Thanks to ChIP-seq technology, scientists can now estimate potential genome-wide target genes of IRF8 and NFATc1. However, finding target genes that are consistently up-regulated or down-regulated across different studies is hard because it requires analyzing a large number of high-throughput expression studies from a comparable context. METHOD: We have developed a machine-learning-based method, called the Cohort-based TF target prediction system (cTAP), to overcome this problem. The method assumes that the pathway involving the transcription factors of interest features multiple “functional groups” of marker genes pertaining to the biological process concerned. For prediction, it uses two notions, Gene-Present Sufficiently (GP) and Gene-Absent Insufficiently (GA), in addition to the log2 fold changes of differentially expressed genes. Targets are predicted by applying multiple machine-learning models, which learn the patterns of GP and GA from log2 fold changes and four types of Z scores from the normalized gene expression data of the cohort. The learned patterns are then associated with the putative transcription factor targets to identify genes that consistently exhibit Up/Down regulation patterns within the cohort. We applied this method to 11 publicly available GEO data sets related to osteoclastogenesis. RESULT: Our experiment identified a small number of Up/Down IRF8 and NFATc1 target genes as relevant to osteoclast differentiation. The machine learning models using GP and GA produced NFATc1 and IRF8 target genes that differ from those obtained using log2 fold change alone. Our literature survey revealed that all predicted target genes have known roles in bone remodeling, specifically in the immune system and in osteoclast formation and function, which supports the validity of our method. CONCLUSION: cTAP was motivated by recognizing that biologists tend to use the Z score values present in data sets for their analysis. However, using cTAP effectively presupposes assembling a sizable cohort of gene expression data sets within a comparable context. As public gene expression data repositories grow, cohort-based analysis methods like cTAP will become increasingly important.
format	Online Article Text
id	pubmed-8740472
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-8740472 2022-01-07 BMC Genomics Research BioMed Central 2022-01-07 /pmc/articles/PMC8740472/ /pubmed/34991467 http://dx.doi.org/10.1186/s12864-021-08159-z Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Predicting the targets of IRF8 and NFATc1 during osteoclast differentiation using the machine learning method framework cTAP
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8740472/ https://www.ncbi.nlm.nih.gov/pubmed/34991467 http://dx.doi.org/10.1186/s12864-021-08159-z
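
The description field in this record mentions log2 fold changes, Z scores, and genes that are consistently Up or Down across a cohort of data sets. The following is a minimal Python sketch of those three generic quantities only. It is not the cTAP implementation: the record does not define the four Z score types or the GP/GA criteria, so the function names, the pseudocount, the fold-change threshold, and the agreement fraction are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, pstdev


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean expression (treated vs. control).

    The pseudocount (an assumption, not from the record) avoids log of zero.
    """
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


def z_scores(values):
    """Standardize the values of one data set: (x - mean) / population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no spread, so every z score is zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def consistent_direction(fold_changes, threshold=1.0, min_fraction=0.8):
    """Label a gene 'Up' or 'Down' when enough data sets in the cohort agree.

    fold_changes holds one log2 fold change per data set. The threshold and
    min_fraction are hypothetical cutoffs. Returns None if neither direction
    reaches the required fraction.
    """
    n = len(fold_changes)
    if n == 0:
        return None
    up = sum(fc >= threshold for fc in fold_changes)
    down = sum(fc <= -threshold for fc in fold_changes)
    if up / n >= min_fraction:
        return "Up"
    if down / n >= min_fraction:
        return "Down"
    return None
```

A call such as consistent_direction([1.4, 2.1, 1.7]) labels a gene as Up because every data set in that made-up cohort exceeds the threshold. A real analysis would replace this fixed rule with the learned models that the description refers to.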

